Phi-3.5-vision-instruct
Microsoft
Phi-3.5-vision-instruct is an open-source multimodal model with 4.2 billion parameters and a 128K-token context window. Built for advanced image understanding and logical reasoning, it handles both single-image tasks and multi-image workloads such as comparison, summarization, and video-frame analysis. Safety-focused post-training improves its instruction following, alignment, and robustness across diverse visual and textual inputs. The model is released under the permissive MIT license.
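To make the multi-image capability concrete, here is a minimal sketch of how a prompt with several attached images can be assembled. It assumes the `<|image_N|>` placeholder convention documented in the public Phi-3.5-vision model card; the helper function itself is illustrative and not part of any official API.

```python
def build_multi_image_prompt(question: str, num_images: int) -> str:
    """Prefix a user question with one <|image_N|> tag per attached image.

    Phi-3.5-vision-style prompts reference images by numbered placeholders;
    the actual image tensors are supplied separately to the processor.
    """
    placeholders = "".join(f"<|image_{i}|>\n" for i in range(1, num_images + 1))
    return placeholders + question

# Example: a two-image comparison prompt.
prompt = build_multi_image_prompt(
    "Summarize the differences between these two charts.", 2
)
print(prompt)
```

In a real pipeline this string would be passed, along with the decoded images, to the model's processor (for example via the Hugging Face `transformers` library with `trust_remote_code=True`); the sketch only shows the prompt-side convention.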
Model Specifications
Technical details and capabilities of Phi-3.5-vision-instruct
Core Specifications
4.2B Parameters
Model size and complexity
500B Training Tokens
Amount of data used in training
128K / 128K
Input / Output tokens
September 30, 2023
Knowledge cutoff date
August 22, 2024
Release date
Performance Insights
Check out how Phi-3.5-vision-instruct handles various AI tasks through comprehensive benchmark results.
Detailed Benchmarks
Dive deeper into Phi-3.5-vision-instruct's performance across specific task categories. Expand each section to see detailed metrics and comparisons.
Uncategorized
MMMU
MathVista
AI2D
ChartQA
Providers Pricing Coming Soon
We're working on gathering comprehensive pricing data from all major providers for Phi-3.5-vision-instruct. Compare costs across platforms to find the best pricing for your use case.