Vision API, REST: Vision.batchAnalyze

Updated September 8, 2023

Analyzes a batch of images and returns results with annotations.

HTTP request

POST https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze
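A minimal Python sketch of preparing this call with the standard library. The `build_request` helper name and the `Authorization: Bearer <IAM token>` header are assumptions that this page does not state; check the service's authentication documentation before relying on them. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Endpoint taken from the "HTTP request" section.
ENDPOINT = "https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze"


def build_request(body: dict, iam_token: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying a JSON body.

    Assumption: the API accepts an IAM token in a Bearer Authorization
    header. This page does not describe authentication.
    """
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {iam_token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the request easy to inspect before any network traffic happens.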

Body parameters

{
  "analyzeSpecs": [
    {
      "features": [
        {
          "type": "string",

          // `analyzeSpecs[].features[]` includes only one of the fields `classificationConfig`, `textDetectionConfig`
          "classificationConfig": {
            "model": "string"
          },
          "textDetectionConfig": {
            "languageCodes": [
              "string"
            ],
            "model": "string"
          },
          // end of the list of possible fields `analyzeSpecs[].features[]`

        }
      ],
      "mimeType": "string",

      // `analyzeSpecs[]` includes only one of the fields `content`, `signature`
      "content": "string",
      "signature": "string",
      // end of the list of possible fields `analyzeSpecs[]`

    }
  ],
  "folderId": "string"
}

Field	Description
analyzeSpecs[]	object Required. A list of specifications. Each specification contains the file to analyze and the features to use for analysis. Restrictions: Supported file formats: `JPEG`, `PNG`. Maximum file size: 1 MB. Image size must not exceed 20 megapixels (width x height). The number of elements must be in the range 1-8.
analyzeSpecs[]. features[]	object Required. Requested features to use for analysis. The number of elements per file must be in the range 1-8.
analyzeSpecs[]. features[]. type	string Type of requested feature. TEXT_DETECTION: Text detection (OCR) feature. CLASSIFICATION: Classification feature. FACE_DETECTION: Face detection feature. IMAGE_COPY_SEARCH: Image copy search.
analyzeSpecs[]. features[]. classificationConfig	object Required for the `CLASSIFICATION` type. Specifies configuration for the classification feature. `analyzeSpecs[].features[]` includes only one of the fields `classificationConfig`, `textDetectionConfig`
analyzeSpecs[]. features[]. classificationConfig. model	string Model to use for image classification. The maximum string length in characters is 256.
analyzeSpecs[]. features[]. textDetectionConfig	object Required for the `TEXT_DETECTION` type. Specifies configuration for the text detection (OCR) feature. `analyzeSpecs[].features[]` includes only one of the fields `classificationConfig`, `textDetectionConfig`
analyzeSpecs[]. features[]. textDetectionConfig. languageCodes[]	string Required. List of languages to recognize in the text. Specified in ISO 639-1 format (for example, `ru`). The number of elements must be in the range 1-8. The maximum string length in characters for each value is 3.
analyzeSpecs[]. features[]. textDetectionConfig. model	string Model to use for text detection. Possible values: `page` (default): this model is suitable for detecting multiple text entries in an image. `line`: this model is suitable for cropped images with one line of text. The maximum string length in characters is 50.
analyzeSpecs[]. mimeType	string MIME type of content (for example, `application/pdf`). The maximum string length in characters is 255.
analyzeSpecs[]. content	string (byte) `analyzeSpecs[]` includes only one of the fields `content`, `signature` Image content, represented as a stream of bytes. Note: As with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64. The maximum string length in characters is 10485760.
analyzeSpecs[]. signature	string `analyzeSpecs[]` includes only one of the fields `content`, `signature` The maximum string length in characters is 16384.
folderId	string ID of the folder to which you have access. Required for authorization with a user account (see UserAccount resource). Don't specify this field if you make the request on behalf of a service account. The maximum string length in characters is 50.
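The body described above can be assembled programmatically. The following Python sketch builds one with a single text detection feature and checks the element-count limits from the table before anything is sent. The helper names are hypothetical, and two readings are assumptions: that the 1 MB limit applies to the raw file before base64 encoding, and that 1 MB means 1024 * 1024 bytes.

```python
import base64

MAX_SPECS = 8                  # analyzeSpecs[]: 1-8 elements
MAX_FEATURES = 8               # features[]: 1-8 elements
MAX_LANGUAGES = 8              # languageCodes[]: 1-8 elements
MAX_FILE_BYTES = 1024 * 1024   # assumption: "1 MB" means 1 MiB of raw bytes


def text_detection_feature(language_codes, model="page"):
    """Build one features[] entry of type TEXT_DETECTION."""
    if not 1 <= len(language_codes) <= MAX_LANGUAGES:
        raise ValueError("languageCodes must contain 1-8 items")
    return {
        "type": "TEXT_DETECTION",
        "textDetectionConfig": {
            "languageCodes": list(language_codes),
            "model": model,
        },
    }


def analyze_spec(image_bytes, features, mime_type=None):
    """Build one analyzeSpecs[] entry that carries the image in `content`."""
    if len(image_bytes) > MAX_FILE_BYTES:
        raise ValueError("image exceeds the file size limit")
    if not 1 <= len(features) <= MAX_FEATURES:
        raise ValueError("features must contain 1-8 items")
    spec = {
        "features": list(features),
        # JSON carries bytes fields as base64 text.
        "content": base64.b64encode(image_bytes).decode("ascii"),
    }
    if mime_type is not None:
        spec["mimeType"] = mime_type
    return spec


def batch_body(specs, folder_id=None):
    """Build the full request body; folderId is added only when given."""
    if not 1 <= len(specs) <= MAX_SPECS:
        raise ValueError("analyzeSpecs must contain 1-8 items")
    body = {"analyzeSpecs": list(specs)}
    if folder_id is not None:
        body["folderId"] = folder_id
    return body
```

For example, `batch_body([analyze_spec(data, [text_detection_feature(["ru", "en"])], "image/png")])` yields a body for a request made on behalf of a service account, where `folderId` is omitted.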

Response

HTTP Code: 200 - OK

{
  "results": [
    {
      "results": [
        {
          "error": {
            "code": "integer",
            "message": "string",
            "details": [
              "object"
            ]
          },

          // `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
          "textDetection": {
            "pages": [
              {
                "width": "string",
                "height": "string",
                "blocks": [
                  {
                    "boundingBox": {
                      "vertices": [
                        {
                          "x": "string",
                          "y": "string"
                        }
                      ]
                    },
                    "lines": [
                      {
                        "boundingBox": {
                          "vertices": [
                            {
                              "x": "string",
                              "y": "string"
                            }
                          ]
                        },
                        "words": [
                          {
                            "boundingBox": {
                              "vertices": [
                                {
                                  "x": "string",
                                  "y": "string"
                                }
                              ]
                            },
                            "text": "string",
                            "confidence": "number",
                            "languages": [
                              {
                                "languageCode": "string",
                                "confidence": "number"
                              }
                            ],
                            "entityIndex": "string"
                          }
                        ],
                        "confidence": "number"
                      }
                    ]
                  }
                ],
                "entities": [
                  {
                    "name": "string",
                    "text": "string"
                  }
                ]
              }
            ]
          },
          "classification": {
            "properties": [
              {
                "name": "string",
                "probability": "number"
              }
            ]
          },
          "faceDetection": {
            "faces": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": "string",
                      "y": "string"
                    }
                  ]
                }
              }
            ]
          },
          "imageCopySearch": {
            "copyCount": "string",
            "topResults": [
              {
                "imageUrl": "string",
                "pageUrl": "string",
                "title": "string",
                "description": "string"
              }
            ]
          },
          // end of the list of possible fields `results[].results[]`

        }
      ],
      "error": {
        "code": "integer",
        "message": "string",
        "details": [
          "object"
        ]
      }
    }
  ]
}
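The nested `textDetection` structure above (pages, blocks, lines, words) can be flattened into plain text lines. The Python sketch below is one way to do that; the function names are hypothetical, and joining words with a single space is an assumption about how the text should be reassembled. Coordinates arrive as strings in the JSON schema, so they are converted to integers explicitly.

```python
def extract_lines(text_detection):
    """Return one string per recognized line, in document order."""
    lines = []
    for page in text_detection.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [word.get("text", "") for word in line.get("words", [])]
                # Assumption: words in a line are separated by one space.
                lines.append(" ".join(words))
    return lines


def box_to_ints(bounding_box):
    """Convert a boundingBox's string coordinates to (x, y) integer pairs."""
    return [
        (int(vertex["x"]), int(vertex["y"]))
        for vertex in bounding_box.get("vertices", [])
    ]
```

Using `.get` with empty defaults keeps the walk tolerant of results in which a page has no blocks or a line has no words.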

Field	Description
results[]	object Request results. Results have the same order as specifications in the request.
results[]. results[]	object Results for each requested feature. Feature results have the same order as in the request.
results[]. results[]. error	object Error returned if processing of the specified feature fails.
results[]. results[]. error. code	integer (int32) Error code. An enum value of google.rpc.Code.
results[]. results[]. error. message	string An error message.
results[]. results[]. error. details[]	object A list of messages that carry the error details.
results[]. results[]. textDetection	object Text detection (OCR) result. `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
results[]. results[]. textDetection. pages[]	object Pages of the recognized file. For JPEG and PNG files, contains only one page.
results[]. results[]. textDetection. pages[]. width	string (int64) Page width in pixels.
results[]. results[]. textDetection. pages[]. height	string (int64) Page height in pixels.
results[]. results[]. textDetection. pages[]. blocks[]	object Recognized text blocks in this page.
results[]. results[]. textDetection. pages[]. blocks[]. boundingBox	object Area on the page where the text block is located.
results[]. results[]. textDetection. pages[]. blocks[]. boundingBox. vertices[]	object The bounding polygon vertices.
results[]. results[]. textDetection. pages[]. blocks[]. boundingBox. vertices[]. x	string (int64) X coordinate in pixels.
results[]. results[]. textDetection. pages[]. blocks[]. boundingBox. vertices[]. y	string (int64) Y coordinate in pixels.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]	object Recognized lines in this block.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. boundingBox	object Area on the page where the line is located.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. boundingBox. vertices[]	object The bounding polygon vertices.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. boundingBox. vertices[]. x	string (int64) X coordinate in pixels.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. boundingBox. vertices[]. y	string (int64) Y coordinate in pixels.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]	object Recognized words in this line.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. boundingBox	object Area on the page where the word is located.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. boundingBox. vertices[]	object The bounding polygon vertices.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. boundingBox. vertices[]. x	string (int64) X coordinate in pixels.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. boundingBox. vertices[]. y	string (int64) Y coordinate in pixels.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. text	string Recognized word value.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. confidence	number (double) Confidence of the OCR results for the word. Range [0, 1].
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. languages[]	object A list of detected languages together with confidence.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. languages[]. languageCode	string Detected language code.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. languages[]. confidence	number (double) Confidence of detected language. Range [0, 1].
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. words[]. entityIndex	string (int64) ID of the recognized word in the `entities` array.
results[]. results[]. textDetection. pages[]. blocks[]. lines[]. confidence	number (double) Confidence of the OCR results for the line. Range [0, 1].
results[]. results[]. textDetection. pages[]. entities[]	object Recognized entities.
results[]. results[]. textDetection. pages[]. entities[]. name	string Entity name.
results[]. results[]. textDetection. pages[]. entities[]. text	string Recognized entity text.
results[]. results[]. classification	object Classification result. `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
results[]. results[]. classification. properties[]	object Properties extracted by a specified model. For example, if you ask to evaluate the image quality, the service could return such properties as `good` and `bad`.
results[]. results[]. classification. properties[]. name	string Property name.
results[]. results[]. classification. properties[]. probability	number (double) Probability of the property, from 0 to 1.
results[]. results[]. faceDetection	object Face detection result. `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
results[]. results[]. faceDetection. faces[]	object An array of detected faces for the specified image.
results[]. results[]. faceDetection. faces[]. boundingBox	object Area on the image where the face is located.
results[]. results[]. faceDetection. faces[]. boundingBox. vertices[]	object The bounding polygon vertices.
results[]. results[]. faceDetection. faces[]. boundingBox. vertices[]. x	string (int64) X coordinate in pixels.
results[]. results[]. faceDetection. faces[]. boundingBox. vertices[]. y	string (int64) Y coordinate in pixels.
results[]. results[]. imageCopySearch	object Image Copy Search result. `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
results[]. results[]. imageCopySearch. copyCount	string (int64) Number of image copies found.
results[]. results[]. imageCopySearch. topResults[]	object Most relevant results of the image copy search.
results[]. results[]. imageCopySearch. topResults[]. imageUrl	string URL of the image.
results[]. results[]. imageCopySearch. topResults[]. pageUrl	string URL of the page that contains the image.
results[]. results[]. imageCopySearch. topResults[]. title	string Title of the page that contains the image.
results[]. results[]. imageCopySearch. topResults[]. description	string Image description.
results[]. error	object Error returned if processing of the file fails or is cancelled.
results[]. error. code	integer (int32) Error code. An enum value of google.rpc.Code.
results[]. error. message	string An error message.
results[]. error. details[]	object A list of messages that carry the error details.
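Because results come back in the same order as the request's specifications and features, a response can be matched to its request by position. The Python sketch below does this and separates file-level errors from feature-level ones. The `pair_results` name and the row layout are hypothetical, and treating a file-level `error` as meaning that no feature results are usable for that file is an assumption.

```python
# The four mutually exclusive payload fields of results[].results[].
RESULT_KEYS = ("textDetection", "classification", "faceDetection", "imageCopySearch")


def pair_results(request_body, response_body):
    """Return rows of (spec_index, feature_type, payload, error).

    Relies on the documented ordering: results[] follows analyzeSpecs[],
    and results[].results[] follows features[].
    """
    rows = []
    specs = request_body.get("analyzeSpecs", [])
    file_results = response_body.get("results", [])
    for spec_index, (spec, file_result) in enumerate(zip(specs, file_results)):
        file_error = file_result.get("error")
        if file_error:
            # Assumption: a file-level error invalidates the whole file.
            rows.append((spec_index, None, None, file_error))
            continue
        feature_results = file_result.get("results", [])
        for feature, feature_result in zip(spec.get("features", []), feature_results):
            payload = next(
                (feature_result[key] for key in RESULT_KEYS if key in feature_result),
                None,
            )
            rows.append(
                (spec_index, feature.get("type"), payload, feature_result.get("error"))
            )
    return rows
```

Each row then tells the caller which image and which feature a payload or error belongs to, without depending on any identifier in the response.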
