Solar Pro Preview:
The most intelligent LLM on a single GPU


Businesses face a critical choice when deciding how to use AI: integrate it into their current systems, or experiment through APIs. At Upstage, we understand that many companies prefer to leverage their existing hardware setup. At the forefront of AI innovation since 2020, Upstage has delivered powerful solutions to global enterprises like Samsung, Intel, and SK. Businesses like these want to stay ahead of the curve, but the journey from proof-of-concept to production can be challenging and expensive.

That’s why we are releasing Solar Pro Preview, a sneak peek at the full version of Solar Pro arriving this November. Solar Pro is a state-of-the-art model designed to deliver exceptional performance on a single GPU, enabling businesses to harness the power of AI without overhauling their infrastructure or sending their data to external APIs.

Solar Pro Preview is available as an open-source model for public use, including commercial applications. We want to get it into the hands of developers, researchers, and businesses so they can start building apps. We believe that by collaborating with these communities, we can accelerate the realization of the full potential of AI on a larger scale, and achieve our mission of Making AI Beneficial. We invite these groups to explore the capabilities of Solar Pro Preview and share their experience with us. Your input will play a crucial role in shaping the future of this powerful language model and helping us deliver even more value to the industry.

Best-in-class intelligence, designed to fit a single GPU

We understand that you want the best AI has to offer, but you also need to consider your bottom line. Solar Pro Preview offers the perfect balance. To support this, we conducted a performance analysis comparing Solar Pro Preview with other open models with the minimum GPU memory, or VRAM, requirements in mind. Our experimental findings indicate that among models that can be deployed on a single GPU, Solar Pro Preview shows superior performance in intelligence and instruction-following capabilities, as measured by MMLU-Pro and IFEval scores respectively. Additionally, Solar Pro Preview demonstrated comparable performance even against models requiring multiple GPUs, such as Llama 3.1 70B.


(*) VRAM usage was measured via torch.cuda.max_memory_reserved() considering generation overhead, with context window token lengths set to 4k. Our setup utilized transformers v4.44.2 and StaticCache for consistent key-value caching. Note that real production environments may incur additional gigabytes of overhead.
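As a rough illustration, the measurement described above can be sketched as follows. This is a minimal sketch, not our exact benchmark harness; the torch.cuda calls require a CUDA GPU, so they are shown as comments:

```python
def to_gib(num_bytes: int) -> float:
    """Convert a byte count (e.g. from torch.cuda.max_memory_reserved()) to GiB."""
    return num_bytes / (1024 ** 3)

# On a CUDA machine, the measurement loop looks roughly like this:
#   import torch
#   torch.cuda.reset_peak_memory_stats()
#   outputs = model.generate(prompt, max_new_tokens=512)  # 4k-token context window
#   print(f"Peak reserved VRAM: {to_gib(torch.cuda.max_memory_reserved()):.2f} GiB")
```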


These results showcase how our depth up-scaling (DUS) method and advanced data recipe enable Solar Pro Preview to achieve cutting-edge performance while optimizing resource utilization. Also, we recognize that optimizing models, including techniques like quantization, plays a critical role in enabling larger models to run on single GPUs. While research in this area is ongoing, we invite the community to collaborate with us in identifying the most effective and practical methods for deploying models across various GPU architectures. For those interested in reproducing our experimental results or seeking more detailed information about the model's architecture, initial weight configurations, or training methods, please visit the model card.
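For example, one common community approach to fitting large models on a single GPU (not an official Upstage recipe) is 4-bit quantization via bitsandbytes. A hypothetical configuration might look like this; the exact settings should be tuned for your GPU and workload:

```python
import torch
from transformers import BitsAndBytesConfig

# Hypothetical 4-bit quantization settings (nf4 with bfloat16 compute).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Passed at load time (sketch):
# model = AutoModelForCausalLM.from_pretrained(
#     "upstage/solar-pro-preview-instruct",
#     quantization_config=quant_config,
#     device_map="auto",
#     trust_remote_code=True,
# )
```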

Easily apply to your use cases

Getting started with Solar Pro Preview is simple. Choose from the integration options below.


1. Integrate directly with our open model: Here's a quick code snippet to begin integrating the open model from Hugging Face into your projects.

# Install requirements
# !pip install transformers==4.44.2 torch==2.3.1 flash_attn==2.5.8 accelerate==0.31.0
 
# Load model
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
 
tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/solar-pro-preview-instruct",
    device_map="cuda",  
    torch_dtype="auto",  
    trust_remote_code=True,
)
 
# Apply chat template
messages = [
    {"role": "user", "content": "Please, introduce yourself."},
]
prompt = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
 
# Generate text
outputs = model.generate(prompt, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))

2. Explore the API from our Console: The Solar Pro Preview API is available for free until its official release this November. You can access it through the Upstage Console and send requests with curl, or build integrations with frameworks like LangChain.

curl --location 'https://api.upstage.ai/v1/solar/chat/completions' \
--header 'Authorization: Bearer UPSTAGE_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
  "model": "solar-pro",
  "messages": [
    {
      "role": "user",
      "content": "Describe how we plan to leverage Upstage products to achieve your mission of AGI for Work."
    }
  ],
  "stream": true
}'
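If you prefer Python over curl, the same request can be sketched with the standard library alone. The helper below is hypothetical, not part of an official SDK; replace UPSTAGE_API_KEY with your own key:

```python
import json

def build_chat_payload(prompt: str, stream: bool = True) -> str:
    """Build the JSON body for the Solar chat completions endpoint."""
    return json.dumps({
        "model": "solar-pro",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })

# Sending the request with stdlib urllib (requires a valid key):
#   import urllib.request
#   req = urllib.request.Request(
#       "https://api.upstage.ai/v1/solar/chat/completions",
#       data=build_chat_payload("Please, introduce yourself.").encode("utf-8"),
#       headers={"Authorization": "Bearer UPSTAGE_API_KEY",
#                "Content-Type": "application/json"},
#   )
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```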

3. Use the AWS marketplace: Solar Pro Preview is also available on the AWS Marketplace and can be deployed on your own AWS infrastructure. As a special offer, the model is available for free until the official Solar Pro release in November 2024, so you only pay for the AWS infrastructure costs. Visit the AWS Marketplace page for more details and a tutorial notebook to help you get started quickly and easily.

Explore more and partner with Upstage

We believe Solar Pro Preview not only stands out as a highly efficient and capable model but also has the potential to be extended to more languages and capabilities. Solar Pro Preview supports only English and a 4k context window; the official version of Solar Pro, arriving in November 2024, will bring expanded language support and longer context windows. To stay informed about the latest updates, please follow us on LinkedIn or Twitter. If you have any feedback or questions about the model, please visit our model discussion board and connect with us directly.


Looking ahead, our upcoming model will feature multimodal capabilities to change the way you interact with complex documents — such as automatic key information extraction or effortless multi-page document summarization. Get a preview of what's to come by trying out Solar DocVision Preview, an experimental vision LLM specialized for documents.

Upstage is dedicated to advancing AI to transform how enterprises operate. If you're eager to accelerate your LLM adoption, please reach out to us to discuss partnership possibilities.

 


Building Tomorrow’s Solutions Today

Talk to an AI expert to find the best solution for your business.