Vision API, REST: Vision.batchAnalyze
Analyzes a batch of images and returns results with annotations.
HTTP request
POST https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze
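A minimal sketch of preparing this request with the Python standard library is shown here. The `Authorization: Bearer` header with an IAM token is an assumption (this reference does not state the authentication scheme), and `build_request`, the placeholder body, and the placeholder token are hypothetical names and values used only for illustration.

```python
import json
import urllib.request

ENDPOINT = "https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze"


def build_request(body: dict, iam_token: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for batchAnalyze."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer authentication with an IAM token.
            "Authorization": f"Bearer {iam_token}",
        },
    )


# Placeholder body and token, for illustration only.
example_body = {
    "analyzeSpecs": [
        {
            "features": [
                {
                    "type": "TEXT_DETECTION",
                    "textDetectionConfig": {"languageCodes": ["en"]},
                }
            ],
            "content": "<base64-encoded image>",
        }
    ]
}
request = build_request(example_body, "<IAM token>")
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs real credentials:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the method, URL, and headers before any network call is made.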
Body parameters
{
  "analyzeSpecs": [
    {
      "features": [
        {
          "type": "string",
          // `analyzeSpecs[].features[]` includes only one of the fields `classificationConfig`, `textDetectionConfig`
          "classificationConfig": {
            "model": "string"
          },
          "textDetectionConfig": {
            "languageCodes": [
              "string"
            ],
            "model": "string"
          },
          // end of the list of possible fields `analyzeSpecs[].features[]`
        }
      ],
      "mimeType": "string",
      // `analyzeSpecs[]` includes only one of the fields `content`, `signature`
      "content": "string",
      "signature": "string",
      // end of the list of possible fields `analyzeSpecs[]`
    }
  ],
  "folderId": "string"
}
Field | Description |
---|---|
analyzeSpecs[] | object Required. A list of specifications. Each specification contains the file to analyze and the features to use for analysis. The number of elements must be in the range 1-8. |
analyzeSpecs[].features[] | object Required. Requested features to use for analysis. At most 8 features can be requested for one file. The number of elements must be in the range 1-8. |
analyzeSpecs[].features[].type | string Type of requested feature. |
analyzeSpecs[].features[].classificationConfig | object Required for the CLASSIFICATION type. Specifies configuration for the classification feature. analyzeSpecs[].features[] includes only one of the fields classificationConfig, textDetectionConfig. |
analyzeSpecs[].features[].classificationConfig.model | string Model to use for image classification. The maximum string length in characters is 256. |
analyzeSpecs[].features[].textDetectionConfig | object Required for the TEXT_DETECTION type. Specifies configuration for the text detection (OCR) feature. analyzeSpecs[].features[] includes only one of the fields classificationConfig, textDetectionConfig. |
analyzeSpecs[].features[].textDetectionConfig.languageCodes[] | string Required. List of languages to recognize in the text, specified in ISO 639-1 format. The number of elements must be in the range 1-8. The maximum string length in characters for each value is 3. |
analyzeSpecs[].features[].textDetectionConfig.model | string Model to use for text detection. The maximum string length in characters is 50. |
analyzeSpecs[].mimeType | string MIME type of the content. The maximum string length in characters is 255. |
analyzeSpecs[].content | string (byte) Image content, represented as a stream of bytes. Note: as with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64. The maximum string length in characters is 10485760. analyzeSpecs[] includes only one of the fields content, signature. |
analyzeSpecs[].signature | string The maximum string length in characters is 16384. analyzeSpecs[] includes only one of the fields content, signature. |
folderId | string ID of the folder to which you have access. Required for authorization with a user account (see UserAccount resource). Don't specify this field if you make the request on behalf of a service account. The maximum string length in characters is 50. |
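The constraints in the table above can be checked while the body is assembled. The following is a sketch only: the helper names are hypothetical, `"en"`, `"image/jpeg"`, and `"example-model"` are illustrative values rather than values taken from this reference, and only the limits stated in the table (1-8 specifications, 1-8 features, 1-8 language codes, base64 `content`) are enforced.

```python
import base64

MAX_ITEMS = 8  # upper bound stated for analyzeSpecs, features and languageCodes


def text_detection_feature(language_codes):
    """Feature of type TEXT_DETECTION with its required languageCodes list."""
    if not 1 <= len(language_codes) <= MAX_ITEMS:
        raise ValueError("languageCodes must contain 1-8 items")
    return {
        "type": "TEXT_DETECTION",
        "textDetectionConfig": {"languageCodes": list(language_codes)},
    }


def classification_feature(model):
    """Feature of type CLASSIFICATION; the model name is caller-supplied."""
    return {"type": "CLASSIFICATION", "classificationConfig": {"model": model}}


def analyze_spec(image_bytes, features, mime_type=None):
    """One specification: base64 content plus the features to run on it."""
    if not 1 <= len(features) <= MAX_ITEMS:
        raise ValueError("features must contain 1-8 items")
    spec = {
        "features": list(features),
        # JSON carries bytes fields as base64 text; `signature` is left out
        # because only one of content/signature may be set.
        "content": base64.b64encode(image_bytes).decode("ascii"),
    }
    if mime_type is not None:
        spec["mimeType"] = mime_type
    return spec


def batch_analyze_body(specs, folder_id=None):
    """Full request body; folderId is omitted for service accounts."""
    if not 1 <= len(specs) <= MAX_ITEMS:
        raise ValueError("analyzeSpecs must contain 1-8 items")
    body = {"analyzeSpecs": list(specs)}
    if folder_id is not None:
        body["folderId"] = folder_id
    return body


# Illustrative values only.
spec = analyze_spec(
    b"abc",
    [text_detection_feature(["en"]), classification_feature("example-model")],
    mime_type="image/jpeg",
)
body = batch_analyze_body([spec], folder_id="<folder ID>")
```

Raising `ValueError` before the request is sent keeps range violations local instead of surfacing them as a server-side error.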
Response
HTTP Code: 200 - OK
{
  "results": [
    {
      "results": [
        {
          "error": {
            "code": "integer",
            "message": "string",
            "details": [
              "object"
            ]
          },
          // `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
          "textDetection": {
            "pages": [
              {
                "width": "string",
                "height": "string",
                "blocks": [
                  {
                    "boundingBox": {
                      "vertices": [
                        {
                          "x": "string",
                          "y": "string"
                        }
                      ]
                    },
                    "lines": [
                      {
                        "boundingBox": {
                          "vertices": [
                            {
                              "x": "string",
                              "y": "string"
                            }
                          ]
                        },
                        "words": [
                          {
                            "boundingBox": {
                              "vertices": [
                                {
                                  "x": "string",
                                  "y": "string"
                                }
                              ]
                            },
                            "text": "string",
                            "confidence": "number",
                            "languages": [
                              {
                                "languageCode": "string",
                                "confidence": "number"
                              }
                            ],
                            "entityIndex": "string"
                          }
                        ],
                        "confidence": "number"
                      }
                    ]
                  }
                ],
                "entities": [
                  {
                    "name": "string",
                    "text": "string"
                  }
                ]
              }
            ]
          },
          "classification": {
            "properties": [
              {
                "name": "string",
                "probability": "number"
              }
            ]
          },
          "faceDetection": {
            "faces": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": "string",
                      "y": "string"
                    }
                  ]
                }
              }
            ]
          },
          "imageCopySearch": {
            "copyCount": "string",
            "topResults": [
              {
                "imageUrl": "string",
                "pageUrl": "string",
                "title": "string",
                "description": "string"
              }
            ]
          },
          // end of the list of possible fields `results[].results[]`
        }
      ],
      "error": {
        "code": "integer",
        "message": "string",
        "details": [
          "object"
        ]
      }
    }
  ]
}
Field | Description |
---|---|
results[] | object Request results. Results have the same order as specifications in the request. |
results[].results[] | object Results for each requested feature. Feature results have the same order as in the request. |
results[].results[].error | object Error returned if processing of the specified feature fails. |
results[].results[].error.code | integer (int32) Error code. An enum value of google.rpc.Code. |
results[].results[].error.message | string An error message. |
results[].results[].error.details[] | object A list of messages that carry the error details. |
results[].results[].textDetection | object Text detection (OCR) result. results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch. |
results[].results[].textDetection.pages[] | object Pages of the recognized file. JPEG and PNG files contain only 1 page. |
results[].results[].textDetection.pages[].width | string (int64) Page width in pixels. |
results[].results[].textDetection.pages[].height | string (int64) Page height in pixels. |
results[].results[].textDetection.pages[].blocks[] | object Recognized text blocks on this page. |
results[].results[].textDetection.pages[].blocks[].boundingBox | object Area on the page where the text block is located. |
results[].results[].textDetection.pages[].blocks[].boundingBox.vertices[] | object The bounding polygon vertices. |
results[].results[].textDetection.pages[].blocks[].boundingBox.vertices[].x | string (int64) X coordinate in pixels. |
results[].results[].textDetection.pages[].blocks[].boundingBox.vertices[].y | string (int64) Y coordinate in pixels. |
results[].results[].textDetection.pages[].blocks[].lines[] | object Recognized lines in this block. |
results[].results[].textDetection.pages[].blocks[].lines[].boundingBox | object Area on the page where the line is located. |
results[].results[].textDetection.pages[].blocks[].lines[].boundingBox.vertices[] | object The bounding polygon vertices. |
results[].results[].textDetection.pages[].blocks[].lines[].boundingBox.vertices[].x | string (int64) X coordinate in pixels. |
results[].results[].textDetection.pages[].blocks[].lines[].boundingBox.vertices[].y | string (int64) Y coordinate in pixels. |
results[].results[].textDetection.pages[].blocks[].lines[].words[] | object Recognized words in this line. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox | object Area on the page where the word is located. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox.vertices[] | object The bounding polygon vertices. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox.vertices[].x | string (int64) X coordinate in pixels. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox.vertices[].y | string (int64) Y coordinate in pixels. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].text | string Recognized word value. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].confidence | number (double) Confidence of the OCR results for the word. Range [0, 1]. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].languages[] | object A list of detected languages together with confidence. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].languages[].languageCode | string Detected language code. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].languages[].confidence | number (double) Confidence of the detected language. Range [0, 1]. |
results[].results[].textDetection.pages[].blocks[].lines[].words[].entityIndex | string (int64) ID of the recognized word in the entities array. |
results[].results[].textDetection.pages[].blocks[].lines[].confidence | number (double) Confidence of the OCR results for the line. Range [0, 1]. |
results[].results[].textDetection.pages[].entities[] | object Recognized entities. |
results[].results[].textDetection.pages[].entities[].name | string Entity name. |
results[].results[].textDetection.pages[].entities[].text | string Recognized entity text. |
results[].results[].classification | object Classification result. results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch. |
results[].results[].classification.properties[] | object Properties extracted by a specified model. For example, if you ask to evaluate the image quality, the service could return such properties as |
results[].results[].classification.properties[].name | string Property name. |
results[].results[].classification.properties[].probability | number (double) Probability of the property, from 0 to 1. |
results[].results[].faceDetection | object Face detection result. results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch. |
results[].results[].faceDetection.faces[] | object An array of detected faces for the specified image. |
results[].results[].faceDetection.faces[].boundingBox | object Area on the image where the face is located. |
results[].results[].faceDetection.faces[].boundingBox.vertices[] | object The bounding polygon vertices. |
results[].results[].faceDetection.faces[].boundingBox.vertices[].x | string (int64) X coordinate in pixels. |
results[].results[].faceDetection.faces[].boundingBox.vertices[].y | string (int64) Y coordinate in pixels. |
results[].results[].imageCopySearch | object Image copy search result. results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch. |
results[].results[].imageCopySearch.copyCount | string (int64) Number of image copies. |
results[].results[].imageCopySearch.topResults[] | object Most relevant results of the image copy search. |
results[].results[].imageCopySearch.topResults[].imageUrl | string URL of the image. |
results[].results[].imageCopySearch.topResults[].pageUrl | string URL of the page that contains the image. |
results[].results[].imageCopySearch.topResults[].title | string Title of the page that contains the image. |
results[].results[].imageCopySearch.topResults[].description | string Image description. |
results[].error | object Error returned if processing of the file fails or is cancelled. |
results[].error.code | integer (int32) Error code. An enum value of google.rpc.Code. |
results[].error.message | string An error message. |
results[].error.details[] | object A list of messages that carry the error details. |
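A response with this shape can be walked in request order, checking the file-level error first and then each feature result. The sketch below is illustrative: the function names are hypothetical, joining words with a single space is an assumption about how to reassemble a line, and a missing coordinate is treated as 0 on the assumption that zero values may be omitted from the JSON.

```python
def page_text(page: dict) -> str:
    """Join recognized words into lines, one output line per OCR line."""
    lines = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            words = [word.get("text", "") for word in line.get("words", [])]
            lines.append(" ".join(words))
    return "\n".join(lines)


def vertices_to_pixels(bounding_box: dict) -> list:
    """Convert int64-as-string coordinates to (x, y) integer pairs."""
    return [
        (int(v.get("x", 0)), int(v.get("y", 0)))
        for v in bounding_box.get("vertices", [])
    ]


def summarize(response: dict) -> list:
    """Return (spec index, feature index, kind, payload) tuples."""
    summary = []
    for spec_index, spec_result in enumerate(response.get("results", [])):
        if spec_result.get("error"):
            # File-level failure: no per-feature results to inspect.
            message = spec_result["error"].get("message", "")
            summary.append((spec_index, None, "error", message))
            continue
        for feature_index, feature in enumerate(spec_result.get("results", [])):
            key = (spec_index, feature_index)
            if feature.get("error"):
                summary.append(key + ("error", feature["error"].get("message", "")))
            elif "textDetection" in feature:
                pages = feature["textDetection"].get("pages", [])
                text = "\n".join(page_text(page) for page in pages)
                summary.append(key + ("textDetection", text))
            elif "classification" in feature:
                properties = feature["classification"].get("properties", [])
                scores = {p["name"]: p["probability"] for p in properties}
                summary.append(key + ("classification", scores))
            elif "faceDetection" in feature:
                faces = feature["faceDetection"].get("faces", [])
                boxes = [vertices_to_pixels(f.get("boundingBox", {})) for f in faces]
                summary.append(key + ("faceDetection", boxes))
            elif "imageCopySearch" in feature:
                count = int(feature["imageCopySearch"].get("copyCount", 0))
                summary.append(key + ("imageCopySearch", count))
    return summary
```

Because results keep the order of the request, the two indices in each tuple map directly back to `analyzeSpecs[]` and `analyzeSpecs[].features[]`.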