X has had its own AI chatbot, Grok, for a while, but it is fair to say it isn't mentioned in the same breath as OpenAI's ChatGPT or Google Gemini. That's not for want of trying, though, and with X's huge user base providing data for the model, a new version was always expected.
Now, the obviously named Grok-2 has entered beta. In a new blog post, X says it represents "a significant step forward from our previous model Grok-1.5, featuring frontier capabilities in chat, coding, and reasoning. At the same time, we are introducing Grok-2 mini, a small but capable sibling of Grok-2. An early version of Grok-2 has been tested on the LMSYS leaderboard under the name 'sus-column-r.' At the time of this blog post, it is outperforming both Claude 3.5 Sonnet and GPT-4-Turbo."[1]
Grok-2 outperforms Claude 3.5 Sonnet and GPT-4-Turbo
So, what's new? As the graph above shows, an early version of Grok-2 has a higher overall Elo score than every comparable chatbot except ChatGPT-4o and Google Gemini.
X also says that Grok-2 and its Mini counterpart "achieve performance levels competitive to other frontier models in areas such as graduate-level science knowledge (GPQA), general knowledge (MMLU, MMLU-Pro), and math competition problems (MATH)," while also pointing to vision-based tasks as an area of improvement.
Grok will also gain a new interface on X, as well as the option to generate images with a prompt. This is achieved through the integration of the popular Flux AI image generation model from Black Forest Labs.
This article is shared at no charge for educational and informational purposes only.
Red Sky Alliance is a Cyber Threat Analysis and Intelligence Service organization. We provide indicator-of-compromise information via a notification service (RedXray) or an analysis service (CTAC). For questions, comments, or assistance, please contact the office directly at 1-844-492-7225 or feedback@redskyalliance.com.
Weekly Cyber Intelligence Briefings:
- Reporting: https://www.redskyalliance.org/
- Website: https://www.redskyalliance.com/
- LinkedIn: https://www.linkedin.com/company/64265941
REDSHORTS - Weekly Cyber Intelligence Briefings
https://register.gotowebinar.com/register/5378972949933166424
[1] https://www.tomsguide.com/ai/elon-musk-drops-grok-2-the-x-based-ai-chatbot-is-now-more-powerful-and-can-make-images