© 2022 Yandex.Cloud LLC
Yandex Data Proc

Method create

Written by Yandex.Cloud
  • HTTP request
  • Body parameters
  • Response

Creates a cluster in the specified folder.

HTTP request

POST https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters
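
As a rough illustration, the request above could be prepared in Python with only the standard library. The Bearer `Authorization` header carrying an IAM token and the helper name `build_create_request` are assumptions that this page does not state; the sketch only constructs the request and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the "HTTP request" section above.
CREATE_CLUSTER_URL = "https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters"


def build_create_request(iam_token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that creates a cluster.

    Assumption: the API accepts an IAM token in a Bearer Authorization header.
    """
    return urllib.request.Request(
        CREATE_CLUSTER_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {iam_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical placeholder values. Passing the result to
# urllib.request.urlopen() would actually send the request.
request = build_create_request("<IAM token>", {"folderId": "<folder ID>"})
```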

Body parameters

{
  "folderId": "string",
  "name": "string",
  "description": "string",
  "labels": "object",
  "configSpec": {
    "versionId": "string",
    "hadoop": {
      "services": [
        "string"
      ],
      "properties": "object",
      "sshPublicKeys": [
        "string"
      ],
      "initializationActions": [
        {
          "uri": "string",
          "args": [
            "string"
          ],
          "timeout": "string"
        }
      ]
    },
    "subclustersSpec": [
      {
        "name": "string",
        "role": "string",
        "resources": {
          "resourcePresetId": "string",
          "diskTypeId": "string",
          "diskSize": "string"
        },
        "subnetId": "string",
        "hostsCount": "string",
        "assignPublicIp": true,
        "autoscalingConfig": {
          "maxHostsCount": "string",
          "preemptible": true,
          "measurementDuration": "string",
          "warmupDuration": "string",
          "stabilizationDuration": "string",
          "cpuUtilizationTarget": "number",
          "decommissionTimeout": "string"
        }
      }
    ]
  },
  "zoneId": "string",
  "serviceAccountId": "string",
  "bucket": "string",
  "uiProxy": true,
  "securityGroupIds": [
    "string"
  ],
  "hostGroupIds": [
    "string"
  ],
  "deletionProtection": true,
  "logGroupId": "string"
}
Field Description
folderId string

Required. ID of the folder to create a cluster in.

To get a folder ID, make a list request.

The maximum string length in characters is 50.

name string

Name of the cluster. The name must be unique within the folder. The name can't be changed after the Data Proc cluster is created.

Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9] (the leading | also permits an empty string).
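
A client could check a candidate name against this pattern before sending the request. The sketch below is one way to do that; the helper name `is_valid_name` is hypothetical, and the server remains the authority on what it accepts.

```python
import re

# Pattern copied from the field description; the leading "|" also allows "".
NAME_PATTERN = re.compile(r"|[a-z][-a-z0-9]{1,61}[a-z0-9]")


def is_valid_name(name: str) -> bool:
    """Return True if the whole string matches the documented pattern."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Under this pattern a non-empty name has 3 to 63 characters, starts with a lowercase letter, and does not end with a hyphen.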

description string

Description of the cluster.

The maximum string length in characters is 256.

labels object

Cluster labels as key:value pairs.

No more than 64 per resource. The string length in characters for each key must be 1-63. Each key must match the regular expression [a-z][-_0-9a-z]*. The maximum string length in characters for each value is 63. Each value must match the regular expression [-_0-9a-z]*.
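
The label constraints above can be sketched as a small client-side check. The function name `label_errors` and the wording of its messages are illustrative only.

```python
import re

# Limits and patterns copied from the description of the labels field.
KEY_PATTERN = re.compile(r"[a-z][-_0-9a-z]*")
VALUE_PATTERN = re.compile(r"[-_0-9a-z]*")
MAX_LABELS = 64
MAX_LENGTH = 63


def label_errors(labels: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if len(labels) > MAX_LABELS:
        errors.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not (1 <= len(key) <= MAX_LENGTH) or not KEY_PATTERN.fullmatch(key):
            errors.append(f"invalid key: {key!r}")
        if len(value) > MAX_LENGTH or not VALUE_PATTERN.fullmatch(value):
            errors.append(f"invalid value for {key!r}: {value!r}")
    return errors
```

Note that the value pattern allows an empty string, so a label such as `{"env": ""}` passes this check.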

configSpec object

Required. Configuration and resources for hosts that should be created with the cluster.

configSpec.versionId string

Version of the image for cluster provisioning.

All available versions are listed in the documentation.

configSpec.hadoop object

Data Proc-specific options.

Hadoop configuration that describes services installed in a cluster, their properties and settings.

configSpec.hadoop.services[] string

Set of services used in the cluster (if empty, the default set is used).

configSpec.hadoop.properties object

Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.

For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
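
To make the key convention concrete, the sketch below splits a key of the form `service:property` into its two parts. The helper name `split_property_key` and the value `"2"` are hypothetical; only the key `hdfs:dfs.replication` comes from the text above.

```python
def split_property_key(key: str) -> tuple:
    """Split '<service>:<property>' into (service, property)."""
    service, separator, prop = key.partition(":")
    if not separator or not service or not prop:
        raise ValueError(f"expected '<service>:<property>', got {key!r}")
    return service, prop


# Example value for configSpec.hadoop.properties (the value "2" is made up).
properties = {"hdfs:dfs.replication": "2"}

for key in properties:
    service, prop = split_property_key(key)
    print(f"service={service} property={prop}")
```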

configSpec.hadoop.sshPublicKeys[] string

List of public SSH keys for access to cluster hosts.

configSpec.hadoop.initializationActions[] object

Set of initialization actions.

configSpec.hadoop.initializationActions[].uri string

URI of the executable file.

configSpec.hadoop.initializationActions[].args[] string

Arguments to the initialization action.

configSpec.hadoop.initializationActions[].timeout string (int64)

Execution timeout.

configSpec.subclustersSpec[] object

Specification for creating subclusters.

configSpec.subclustersSpec[].name string

Name of the subcluster.

Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9] (the leading | also permits an empty string).

configSpec.subclustersSpec[].role string

Required. Role of the subcluster in the Data Proc cluster.

  • MASTERNODE: The subcluster fulfills the master role. Master can run the following services, depending on the requested components:
    • HDFS: Namenode, Secondary Namenode
    • YARN: ResourceManager, Timeline Server
    • HBase Master
    • Hive: Server, Metastore, HCatalog
    • Spark History Server
    • Zeppelin
    • ZooKeeper
  • DATANODE: The subcluster is a DATANODE in a Data Proc cluster. DATANODE can run the following services, depending on the requested components:
    • HDFS DataNode
    • YARN NodeManager
    • HBase RegionServer
    • Spark libraries
  • COMPUTENODE: The subcluster is a COMPUTENODE in a Data Proc cluster. COMPUTENODE can run the following services, depending on the requested components:
    • YARN NodeManager
    • Spark libraries

configSpec.subclustersSpec[].resources object

Required. Resource configuration for hosts in the subcluster.

configSpec.subclustersSpec[].resources.resourcePresetId string

ID of the resource preset for computational resources available to a host (CPU, memory, etc.). All available presets are listed in the documentation.

configSpec.subclustersSpec[].resources.diskTypeId string

Type of the storage environment for the host. Possible values:

  • network-hdd: network HDD drive
  • network-ssd: network SSD drive

configSpec.subclustersSpec[].resources.diskSize string (int64)

Volume of the storage available to a host, in bytes.
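
Because the field is a byte count carried as a string (int64), a small conversion helper can reduce mistakes. The sketch below assumes sizes are chosen in binary gibibytes (1 GiB = 1024³ bytes); the helper name `disk_size_field` is hypothetical.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def disk_size_field(gibibytes: int) -> str:
    """Return a byte count as a decimal string for the diskSize field."""
    if gibibytes <= 0:
        raise ValueError("disk size must be positive")
    return str(gibibytes * GIB)


print(disk_size_field(128))  # 137438953472
```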

configSpec.subclustersSpec[].subnetId string

Required. ID of the VPC subnet used for hosts in the subcluster.

The maximum string length in characters is 50.

configSpec.subclustersSpec[].hostsCount string (int64)

Required. Number of hosts in the subcluster.

The minimum value is 1.

configSpec.subclustersSpec[].assignPublicIp boolean (boolean)

Assign public IP addresses to all hosts in the subcluster.

configSpec.subclustersSpec[].autoscalingConfig object

Configuration for instance-group-based subclusters.

configSpec.subclustersSpec[].autoscalingConfig.maxHostsCount string (int64)

Upper limit for the total number of instances in the subcluster.

Acceptable values are 1 to 100, inclusive.

configSpec.subclustersSpec[].autoscalingConfig.preemptible boolean (boolean)

Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines.

configSpec.subclustersSpec[].autoscalingConfig.measurementDuration string

Required. Time in seconds allotted for averaging metrics.

Acceptable values are 60 seconds to 600 seconds, inclusive.

configSpec.subclustersSpec[].autoscalingConfig.warmupDuration string

The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected.

The maximum value is 600 seconds.

configSpec.subclustersSpec[].autoscalingConfig.stabilizationDuration string

Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should.

Acceptable values are 60 seconds to 1800 seconds, inclusive.

configSpec.subclustersSpec[].autoscalingConfig.cpuUtilizationTarget number (double)

Defines an autoscaling rule based on the average CPU utilization of the instance group.

Acceptable values are 10 to 100, inclusive.

configSpec.subclustersSpec[].autoscalingConfig.decommissionTimeout string (int64)

Timeout, in seconds, for gracefully decommissioning nodes during downscaling. Default value: 120.

Acceptable values are 0 to 86400, inclusive.
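
The numeric limits listed for autoscalingConfig can be collected into one range check. In this sketch every value is a plain number (durations in seconds); how durations are encoded on the wire is not specified here, so converting to the string form is left out. The lower bound of 0 for warmupDuration is an assumption, since only its maximum is documented, and the name `autoscaling_errors` is hypothetical.

```python
# (min, max) limits copied from the field descriptions above.
AUTOSCALING_RANGES = {
    "maxHostsCount": (1, 100),
    "measurementDuration": (60, 600),
    "warmupDuration": (0, 600),  # lower bound is an assumption
    "stabilizationDuration": (60, 1800),
    "cpuUtilizationTarget": (10, 100),
    "decommissionTimeout": (0, 86400),
}


def autoscaling_errors(config: dict) -> list:
    """Return a list of problems found in an autoscaling config."""
    errors = []
    # measurementDuration is the only field marked as required.
    if "measurementDuration" not in config:
        errors.append("measurementDuration is required")
    for field, value in config.items():
        if field in AUTOSCALING_RANGES:
            low, high = AUTOSCALING_RANGES[field]
            if not low <= value <= high:
                errors.append(f"{field}={value} is outside {low}..{high}")
    return errors
```

Fields without a documented range, such as preemptible, are passed through unchecked.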

zoneId string

Required. ID of the availability zone where the cluster should be placed.

To get the list of available zones, make a list request.

The maximum string length in characters is 50.

serviceAccountId string

Required. ID of the service account to be used by the Data Proc manager agent.

bucket string

Name of the Object Storage bucket to use for Data Proc jobs.

uiProxy boolean (boolean)

Enables the UI Proxy feature.

securityGroupIds[] string

User security groups.

hostGroupIds[] string

Host groups on which to place the cluster's VMs.

deletionProtection boolean (boolean)

Deletion protection prevents the cluster from being deleted.

logGroupId string

ID of the Cloud Logging log group to write logs to. If not set, logs are not sent to the logging service.
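
Putting the fields together, a minimal request body might look like the sketch below. Every value in angle brackets is a placeholder, the subcluster names are made up, and the helper `missing_required_fields` is hypothetical; it only checks the fields that the descriptions above mark as "Required".

```python
import json

# Fields marked "Required" in the descriptions above.
REQUIRED_TOP_LEVEL = ("folderId", "configSpec", "zoneId", "serviceAccountId")
REQUIRED_SUBCLUSTER = ("role", "resources", "subnetId", "hostsCount")


def missing_required_fields(body: dict) -> list:
    """List required fields that are absent from a create-cluster body."""
    missing = [name for name in REQUIRED_TOP_LEVEL if name not in body]
    subclusters = body.get("configSpec", {}).get("subclustersSpec", [])
    for index, spec in enumerate(subclusters):
        for name in REQUIRED_SUBCLUSTER:
            if name not in spec:
                missing.append(f"configSpec.subclustersSpec[{index}].{name}")
    return missing


def subcluster(name: str, role: str) -> dict:
    """Build one subcluster spec with placeholder resources."""
    return {
        "name": name,
        "role": role,
        "resources": {
            "resourcePresetId": "<resource preset ID>",
            "diskTypeId": "network-ssd",
            "diskSize": "137438953472",  # 128 * 1024**3 bytes, as a string
        },
        "subnetId": "<subnet ID>",
        "hostsCount": "1",  # int64 fields are shown as strings in the schema
    }


example_body = {
    "folderId": "<folder ID>",
    "name": "example-cluster",
    "configSpec": {
        "subclustersSpec": [
            subcluster("master", "MASTERNODE"),
            subcluster("data", "DATANODE"),
        ]
    },
    "zoneId": "<availability zone ID>",
    "serviceAccountId": "<service account ID>",
}

print(json.dumps(example_body, indent=2))
```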

Response

HTTP Code: 200 - OK

{
  "id": "string",
  "description": "string",
  "createdAt": "string",
  "createdBy": "string",
  "modifiedAt": "string",
  "done": true,
  "metadata": "object",

  // includes only one of the fields `error`, `response`
  "error": {
    "code": "integer",
    "message": "string",
    "details": [
      "object"
    ]
  },
  "response": "object"
  // end of the list of possible fields
}

An Operation resource. For more information, see Operation.

Field Description
id string

ID of the operation.

description string

Description of the operation. 0-256 characters long.

createdAt string (date-time)

Creation timestamp.

String in RFC3339 text format.

createdBy string

ID of the user or service account who initiated the operation.

modifiedAt string (date-time)

The time when the Operation resource was last modified.

String in RFC3339 text format.

done boolean (boolean)

If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.

metadata object

Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.

error object
includes only one of the fields error, response

The error result of the operation in case of failure or cancellation.

error.code integer (int32)

Error code. An enum value of google.rpc.Code.

error.message string

An error message.

error.details[] object

A list of messages that carry the error details.

response object
includes only one of the fields error, response

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is the standard Create/Update, the response should be the target resource of the operation. Any method that returns a long-running operation should document the response type, if any.
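
The relationship between done, error, and response can be sketched as a small helper that classifies a decoded Operation object. The function name `operation_status` and the status strings are illustrative; how a client obtains an updated Operation while it is still in progress is outside this sketch.

```python
def operation_status(operation: dict) -> tuple:
    """Classify an Operation as in progress, failed, or succeeded.

    Returns (status, payload): payload is the error object on failure,
    the response object on success, and None while still in progress.
    """
    if not operation.get("done", False):
        return ("in_progress", None)
    if "error" in operation:
        return ("failed", operation["error"])
    return ("succeeded", operation.get("response"))


# Hypothetical sample objects shaped like the schema above.
running = {"id": "<operation ID>", "done": False}
failed = {"id": "<operation ID>", "done": True,
          "error": {"code": 3, "message": "bad request", "details": []}}
finished = {"id": "<operation ID>", "done": True,
            "response": {"id": "<cluster ID>"}}

for sample in (running, failed, finished):
    print(operation_status(sample)[0])
```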
