© 2022 Yandex.Cloud LLC

ClusterService

Article created by Yandex.Cloud

A set of methods for managing Data Proc clusters.

Call Description
Get Returns the specified cluster.
List Retrieves the list of clusters in the specified folder.
Create Creates a cluster in the specified folder.
Update Updates the configuration of the specified cluster.
Delete Deletes the specified cluster.
Start Starts the specified cluster.
Stop Stops the specified cluster.
ListOperations Lists operations for the specified cluster.
ListHosts Retrieves the list of hosts in the specified cluster.
ListUILinks Retrieves a list of links to web interfaces being proxied by Data Proc UI Proxy.

ClusterService calls

Get

Returns the specified cluster.
To get the list of all available clusters, make a ClusterService.List request.

rpc Get (GetClusterRequest) returns (Cluster)

GetClusterRequest

Field Description
cluster_id string
Required. ID of the Data Proc cluster.
To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50.

Cluster

Field Description
id string
ID of the cluster. Generated at creation time.
folder_id string
ID of the folder that the cluster belongs to.
created_at google.protobuf.Timestamp
Creation timestamp.
name string
Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63.
description string
Description of the cluster. The string length in characters must be 0-256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource.
monitoring[] Monitoring
Monitoring systems relevant to the cluster.
config ClusterConfig
Configuration of the cluster.
health enum Health
Aggregated cluster health.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
status enum Status
Cluster status.
  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.
zone_id string
ID of the availability zone where the cluster resides.
service_account_id string
ID of service account for the Data Proc manager agent.
bucket string
Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
ui_proxy bool
Whether UI Proxy feature is enabled.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups hosting VMs of the cluster.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.
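
The health and status fields are reported independently, so a caller will usually look at both before submitting work. The sketch below shows one possible policy, not a rule stated by the service: it treats a cluster as ready only when it is RUNNING and ALIVE. The helper names `is_ready` and `should_wait` are hypothetical, and enum values are passed as plain strings rather than generated enum types.

```python
# Enum names are copied from the Cluster message above.
# is_ready and should_wait are hypothetical helpers, not part of the API.
TRANSITIONAL = {"CREATING", "STARTING", "STOPPING"}


def is_ready(status: str, health: str) -> bool:
    """Return True only when the cluster is RUNNING and reported ALIVE."""
    return status == "RUNNING" and health == "ALIVE"


def should_wait(status: str) -> bool:
    """Return True while the cluster is moving between states."""
    return status in TRANSITIONAL
```

A stricter or looser policy (for example, accepting DEGRADED) may suit some workloads better.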

Monitoring

Field Description
name string
Name of the monitoring system.
description string
Description of the monitoring system.
link string
Link to the monitoring system.

ClusterConfig

Field Description
version_id string
Image version for cluster provisioning. All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific configuration options.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for accessing cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.
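
Each key in the HadoopConfig `properties` map combines a service name and a property name, as in 'hdfs:dfs.replication'. A minimal sketch of splitting such a key on the client side, assuming the first colon is the separator; `split_property_key` is a hypothetical helper, not part of the API.

```python
from typing import Tuple


def split_property_key(key: str) -> Tuple[str, str]:
    """Split 'service:property' into its two parts at the first colon."""
    service, sep, prop = key.partition(":")
    if not sep or not service or not prop:
        raise ValueError(f"expected 'service:property', got {key!r}")
    return service, prop


print(split_property_key("hdfs:dfs.replication"))  # ('hdfs', 'dfs.replication')
```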

List

Retrieves the list of clusters in the specified folder.

rpc List (ListClustersRequest) returns (ListClustersResponse)

ListClustersRequest

Field Description
folder_id string
Required. ID of the folder to list clusters in.
To get the folder ID, make a yandex.cloud.resourcemanager.v1.FolderService.List request. The maximum string length in characters is 50.
page_size int64
The maximum number of results per page to return. If the number of available results is larger than page_size, the service returns a ListClustersResponse.next_page_token that can be used to get the next page of results in subsequent list requests. Default value: 100. The maximum value is 1000.
page_token string
Page token. To get the next page of results, set page_token to the ListClustersResponse.next_page_token returned by a previous list request. The maximum string length in characters is 100.
filter string
A filter expression that filters clusters listed in the response.
The expression must specify:
  1. The field name. Currently you can use filtering only on Cluster.name field.
  2. An = operator.
  3. The value in double quotes ("). Must be 3-63 characters long and match the regular expression [a-z][-a-z0-9]{1,61}[a-z0-9].
Example of a filter: name="my-cluster". The maximum string length in characters is 1000.
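
The three filter rules above can be sketched as a small builder that checks the value against the quoted pattern and wraps it in double quotes, as rule 3 requires. `name_filter` is a hypothetical helper; only the regular expression comes from the description.

```python
import re

# Pattern quoted in the filter description above.
NAME_RE = re.compile(r"[a-z][-a-z0-9]{1,61}[a-z0-9]")


def name_filter(cluster_name: str) -> str:
    """Build a name="..." filter; raise ValueError for an invalid name."""
    if not NAME_RE.fullmatch(cluster_name):
        raise ValueError(f"invalid cluster name: {cluster_name!r}")
    return f'name="{cluster_name}"'


print(name_filter("my-cluster"))  # name="my-cluster"
```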

ListClustersResponse

Field Description
clusters[] Cluster
List of clusters in the specified folder.
next_page_token string
Token for getting the next page of the list. If the number of results is greater than the specified ListClustersRequest.page_size, use next_page_token as the value for the ListClustersRequest.page_token parameter in the next list request.
Each subsequent page will have its own next_page_token to continue paging through the results.
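
The paging contract can be sketched as a loop that feeds each next_page_token back as page_token. Here `fetch_page` is a hypothetical stand-in for a ClusterService.List call, and the sketch assumes that an empty next_page_token marks the last page.

```python
from typing import Callable, Iterator, List, Tuple


def iter_clusters(
    fetch_page: Callable[[str], Tuple[List[str], str]]
) -> Iterator[str]:
    """Yield clusters from every page.

    fetch_page(page_token) is a hypothetical callable returning
    (clusters, next_page_token) for one List request.
    """
    token = ""
    while True:
        clusters, token = fetch_page(token)
        yield from clusters
        if not token:  # assumption: empty token means no more pages
            break


# Usage with two fake pages standing in for real responses.
pages = {"": (["c1", "c2"], "t1"), "t1": (["c3"], "")}
print(list(iter_clusters(lambda tok: pages[tok])))  # ['c1', 'c2', 'c3']
```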

Cluster

Field Description
id string
ID of the cluster. Generated at creation time.
folder_id string
ID of the folder that the cluster belongs to.
created_at google.protobuf.Timestamp
Creation timestamp.
name string
Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63.
description string
Description of the cluster. The string length in characters must be 0-256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource.
monitoring[] Monitoring
Monitoring systems relevant to the cluster.
config ClusterConfig
Configuration of the cluster.
health enum Health
Aggregated cluster health.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
status enum Status
Cluster status.
  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.
zone_id string
ID of the availability zone where the cluster resides.
service_account_id string
ID of service account for the Data Proc manager agent.
bucket string
Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
ui_proxy bool
Whether UI Proxy feature is enabled.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups hosting VMs of the cluster.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.

Monitoring

Field Description
name string
Name of the monitoring system.
description string
Description of the monitoring system.
link string
Link to the monitoring system.

ClusterConfig

Field Description
version_id string
Image version for cluster provisioning. All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific configuration options.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for accessing cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

Create

Creates a cluster in the specified folder.

rpc Create (CreateClusterRequest) returns (operation.Operation)

Metadata and response of Operation:

    Operation.metadata:CreateClusterMetadata

    Operation.response:Cluster

CreateClusterRequest

Field Description
folder_id string
Required. ID of the folder to create a cluster in.
To get a folder ID, make a yandex.cloud.resourcemanager.v1.FolderService.List request. The maximum string length in characters is 50.
name string
Name of the cluster. The name must be unique within the folder. The name can't be changed after the Data Proc cluster is created. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9].
description string
Description of the cluster. The maximum string length in characters is 256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource. The maximum string length in characters for each value is 63. Each value must match the regular expression [-_0-9a-z]*. The string length in characters for each key must be 1-63. Each key must match the regular expression [a-z][-_0-9a-z]*.
config_spec CreateClusterConfigSpec
Required. Configuration and resources for hosts that should be created with the cluster.
zone_id string
Required. ID of the availability zone where the cluster should be placed.
To get the list of available zones, make a yandex.cloud.compute.v1.ZoneService.List request. The maximum string length in characters is 50.
service_account_id string
Required. ID of the service account to be used by the Data Proc manager agent.
bucket string
Name of the Object Storage bucket to use for Data Proc jobs.
ui_proxy bool
Enable UI Proxy feature.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups to place VMs of cluster on.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, logs will not be sent to the logging service.
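
The name and label constraints in CreateClusterRequest can be checked on the client before the request is sent. A minimal sketch using the regular expressions quoted above; `validate_create_fields` is a hypothetical helper, and note that the name pattern as written (with its leading `|`) also accepts an empty string.

```python
import re
from typing import Dict, List

# Patterns copied from the field descriptions above.
NAME_RE = re.compile(r"|[a-z][-a-z0-9]{1,61}[a-z0-9]")
LABEL_KEY_RE = re.compile(r"[a-z][-_0-9a-z]*")
LABEL_VALUE_RE = re.compile(r"[-_0-9a-z]*")


def validate_create_fields(name: str, labels: Dict[str, str]) -> List[str]:
    """Return a list of constraint violations (empty if all checks pass)."""
    errors = []
    if not NAME_RE.fullmatch(name):
        errors.append("name")
    if len(labels) > 64:
        errors.append("labels: more than 64")
    for key, value in labels.items():
        if not (1 <= len(key) <= 63 and LABEL_KEY_RE.fullmatch(key)):
            errors.append(f"label key {key!r}")
        if not (len(value) <= 63 and LABEL_VALUE_RE.fullmatch(value)):
            errors.append(f"label value {value!r}")
    return errors


print(validate_create_fields("my-cluster", {"env": "prod"}))  # []
```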

CreateClusterConfigSpec

Field Description
version_id string
Version of the image for cluster provisioning.
All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific options.
subclusters_spec[] CreateSubclusterConfigSpec
Specification for creating subclusters.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for accessing cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

CreateSubclusterConfigSpec

Field Description
name string
Name of the subcluster. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9].
role enum Role
Required. Role of the subcluster in the Data Proc cluster.
  • MASTERNODE: The subcluster fulfills the master role.
    Master can run the following services, depending on the requested components:
    • HDFS: Namenode, Secondary Namenode
    • YARN: ResourceManager, Timeline Server
    • HBase Master
    • Hive: Server, Metastore, HCatalog
    • Spark History Server
    • Zeppelin
    • ZooKeeper
  • DATANODE: The subcluster is a DATANODE in a Data Proc cluster.
    DATANODE can run the following services, depending on the requested components:
    • HDFS DataNode
    • YARN NodeManager
    • HBase RegionServer
    • Spark libraries
  • COMPUTENODE: The subcluster is a COMPUTENODE in a Data Proc cluster.
    COMPUTENODE can run the following services, depending on the requested components:
    • YARN NodeManager
    • Spark libraries
resources Resources
Required. Resource configuration for hosts in the subcluster.
subnet_id string
Required. ID of the VPC subnet used for hosts in the subcluster. The maximum string length in characters is 50.
hosts_count int64
Required. Number of hosts in the subcluster. The minimum value is 1.
assign_public_ip bool
Assign public IP addresses to all hosts in the subcluster.
autoscaling_config AutoscalingConfig
Configuration for instance group-based subclusters.

Resources

Field Description
resource_preset_id string
ID of the resource preset for computational resources available to a host (CPU, memory etc.). All available presets are listed in the documentation.
disk_type_id string
Type of the storage environment for the host. Possible values:
  • network-hdd - network HDD drive,
  • network-ssd - network SSD drive.
disk_size int64
Volume of the storage available to a host, in bytes.
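
Because disk_size is expressed in bytes, a size chosen in larger units has to be converted first. A small sketch, assuming the size is planned in binary gibibytes (1 GiB = 1024^3 bytes); `gib_to_bytes` is a hypothetical helper.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def gib_to_bytes(gib: int) -> int:
    """Convert a size in GiB to the byte count expected by disk_size."""
    return gib * GIB


print(gib_to_bytes(128))  # 137438953472
```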

AutoscalingConfig

Field Description
max_hosts_count int64
Upper limit on the total number of instances in the subcluster. Acceptable values are 1 to 100, inclusive.
preemptible bool
Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines.
measurement_duration google.protobuf.Duration
Required. Time in seconds allotted for averaging metrics. Acceptable values are 1m to 10m, inclusive.
warmup_duration google.protobuf.Duration
The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected. The maximum value is 10m.
stabilization_duration google.protobuf.Duration
Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should. Acceptable values are 1m to 30m, inclusive.
cpu_utilization_target double
Defines an autoscaling rule based on the average CPU utilization of the instance group. Acceptable values are 10 to 100, inclusive.
decommission_timeout int64
Timeout to gracefully decommission nodes during downscaling, in seconds. Default value: 120. Acceptable values are 0 to 86400, inclusive.
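
The acceptable ranges listed for AutoscalingConfig can be checked before a request is built. A minimal sketch that models the Duration fields as `datetime.timedelta` values; `check_autoscaling` and its parameter names are hypothetical.

```python
from datetime import timedelta
from typing import List


def check_autoscaling(
    max_hosts_count: int,
    measurement: timedelta,
    warmup: timedelta,
    stabilization: timedelta,
    cpu_target: float,
    decommission_timeout: int = 120,
) -> List[str]:
    """Return the names of fields that fall outside the documented ranges."""
    minute = timedelta(minutes=1)
    errors = []
    if not 1 <= max_hosts_count <= 100:
        errors.append("max_hosts_count")
    if not minute <= measurement <= 10 * minute:
        errors.append("measurement_duration")
    if warmup > 10 * minute:
        errors.append("warmup_duration")
    if not minute <= stabilization <= 30 * minute:
        errors.append("stabilization_duration")
    if not 10 <= cpu_target <= 100:
        errors.append("cpu_utilization_target")
    if not 0 <= decommission_timeout <= 86400:
        errors.append("decommission_timeout")
    return errors
```

An empty list means every value is inside its documented range; the service may still apply checks that are not listed in this section.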

Operation

Field Description
id string
ID of the operation.
description string
Description of the operation. 0-256 characters long.
created_at google.protobuf.Timestamp
Creation timestamp.
created_by string
ID of the user or service account who initiated the operation.
modified_at google.protobuf.Timestamp
The time when the Operation resource was last modified.
done bool
If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata google.protobuf.Any<CreateClusterMetadata>
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.
result oneof: error or response
The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true, exactly one of error or response is set.
  error google.rpc.Status
The error result of the operation in case of failure or cancellation.
  response google.protobuf.Any<Cluster>
The result of the operation if it finished successfully. For this call, it contains the Cluster resource.
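
The done, error, and response fields suggest a simple polling loop on the client side. The sketch below assumes a hypothetical `get_op` callable that returns the current Operation as a plain dict with those keys; a real client would instead call the operation service and unpack the protobuf message.

```python
import time
from typing import Any, Callable, Dict


def wait_operation(
    get_op: Callable[[], Dict[str, Any]],
    interval: float = 1.0,
    max_polls: int = 60,
) -> Any:
    """Poll until the operation reports a result.

    Raises RuntimeError as soon as an error is set, returns the response
    once done is true, and raises TimeoutError after max_polls attempts.
    """
    for _ in range(max_polls):
        op = get_op()
        if op.get("error"):
            raise RuntimeError(op["error"])
        if op.get("done"):
            return op.get("response")
        time.sleep(interval)
    raise TimeoutError("operation did not finish in time")
```

The polling interval and attempt limit are arbitrary illustration values, not recommendations from the service.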

CreateClusterMetadata

Field Description
cluster_id string
ID of the cluster that is being created.

Cluster

Field Description
id string
ID of the cluster. Generated at creation time.
folder_id string
ID of the folder that the cluster belongs to.
created_at google.protobuf.Timestamp
Creation timestamp.
name string
Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63.
description string
Description of the cluster. The string length in characters must be 0-256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource.
monitoring[] Monitoring
Monitoring systems relevant to the cluster.
config ClusterConfig
Configuration of the cluster.
health enum Health
Aggregated cluster health.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
status enum Status
Cluster status.
  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.
zone_id string
ID of the availability zone where the cluster resides.
service_account_id string
ID of service account for the Data Proc manager agent.
bucket string
Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
ui_proxy bool
Whether UI Proxy feature is enabled.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups hosting VMs of the cluster.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.

Monitoring

Field Description
name string
Name of the monitoring system.
description string
Description of the monitoring system.
link string
Link to the monitoring system.

ClusterConfig

Field Description
version_id string
Image version for cluster provisioning. All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific configuration options.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for accessing cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

Update

Updates the configuration of the specified cluster.

rpc Update (UpdateClusterRequest) returns (operation.Operation)

Metadata and response of Operation:

    Operation.metadata:UpdateClusterMetadata

    Operation.response:Cluster

UpdateClusterRequest

Field Description
cluster_id string
ID of the cluster to update.
To get the cluster ID, make a ClusterService.List request. The maximum string length in characters is 50.
update_mask google.protobuf.FieldMask
Field mask that specifies which attributes of the cluster should be updated.
description string
New description for the cluster. The maximum string length in characters is 256.
labels map<string,string>
A new set of cluster labels as key:value pairs. No more than 64 per resource. The maximum string length in characters for each value is 63. Each value must match the regular expression [-_0-9a-z]*. The string length in characters for each key must be 1-63. Each key must match the regular expression [a-z][-_0-9a-z]*.
config_spec UpdateClusterConfigSpec
Configuration and resources for hosts that should be created with the Data Proc cluster.
name string
New name for the Data Proc cluster. The name must be unique within the folder. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9].
service_account_id string
ID of the new service account to be used by the Data Proc manager agent.
bucket string
Name of the new Object Storage bucket to use for Data Proc jobs.
decommission_timeout int64
Timeout to gracefully decommission nodes, in seconds. Default value: 0. Acceptable values are 0 to 86400, inclusive.
ui_proxy bool
Enable UI Proxy feature.
security_group_ids[] string
User security groups.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, logs will not be sent to the logging service.
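
The update_mask field names the attributes that an Update call should change. A minimal sketch of assembling such a request as a plain dict, where the mask lists exactly the fields passed in; `build_update` is a hypothetical helper, and the `paths` key mirrors the repeated `paths` field of google.protobuf.FieldMask.

```python
from typing import Any, Dict


def build_update(cluster_id: str, **changes: Any) -> Dict[str, Any]:
    """Return a request dict whose update_mask lists exactly the changed fields."""
    return {
        "cluster_id": cluster_id,
        "update_mask": {"paths": sorted(changes)},
        **changes,
    }


request = build_update("my-cluster-id", description="nightly ETL", ui_proxy=True)
print(request["update_mask"])  # {'paths': ['description', 'ui_proxy']}
```

Deriving the mask from the supplied fields keeps the two from drifting apart; how the service treats fields left out of the mask is not described in this section.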

UpdateClusterConfigSpec

Field Description
subclusters_spec[] UpdateSubclusterConfigSpec
New configuration for subclusters in a cluster.
hadoop HadoopConfig
Hadoop-specific options.

UpdateSubclusterConfigSpec

Field Description
id string
ID of the subcluster to update.
To get the subcluster ID, make a SubclusterService.List request.
name string
Name of the subcluster. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9].
resources Resources
Resource configuration for each host in the subcluster.
hosts_count int64
Number of hosts in the subcluster. The minimum value is 1.
autoscaling_config AutoscalingConfig
Configuration for instance group-based subclusters.

Resources

Field Description
resource_preset_id string
ID of the resource preset for computational resources available to a host (CPU, memory etc.). All available presets are listed in the documentation.
disk_type_id string
Type of the storage environment for the host. Possible values:
  • network-hdd - network HDD drive,
  • network-ssd - network SSD drive.
disk_size int64
Volume of the storage available to a host, in bytes.

AutoscalingConfig

Field Description
max_hosts_count int64
Upper limit on the total number of instances in the subcluster. Acceptable values are 1 to 100, inclusive.
preemptible bool
Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines.
measurement_duration google.protobuf.Duration
Required. Time in seconds allotted for averaging metrics. Acceptable values are 1m to 10m, inclusive.
warmup_duration google.protobuf.Duration
The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected. The maximum value is 10m.
stabilization_duration google.protobuf.Duration
Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should. Acceptable values are 1m to 30m, inclusive.
cpu_utilization_target double
Defines an autoscaling rule based on the average CPU utilization of the instance group. Acceptable values are 10 to 100, inclusive.
decommission_timeout int64
Timeout to gracefully decommission nodes during downscaling, in seconds. Default value: 120. Acceptable values are 0 to 86400, inclusive.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for accessing cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

Operation

Field Description
id string
ID of the operation.
description string
Description of the operation. 0-256 characters long.
created_at google.protobuf.Timestamp
Creation timestamp.
created_by string
ID of the user or service account who initiated the operation.
modified_at google.protobuf.Timestamp
The time when the Operation resource was last modified.
done bool
If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata google.protobuf.Any<UpdateClusterMetadata>
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.
result oneof: error or response
The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true, exactly one of error or response is set.
  error google.rpc.Status
The error result of the operation in case of failure or cancellation.
  response google.protobuf.Any<Cluster>
The result of the operation if it finished successfully. For this call, it contains the Cluster resource.

UpdateClusterMetadata

Field Description
cluster_id string
ID of the cluster that is being updated.

Cluster

Field Description
id string
ID of the cluster. Generated at creation time.
folder_id string
ID of the folder that the cluster belongs to.
created_at google.protobuf.Timestamp
Creation timestamp.
name string
Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63.
description string
Description of the cluster. The string length in characters must be 0-256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource.
monitoring[] Monitoring
Monitoring systems relevant to the cluster.
config ClusterConfig
Configuration of the cluster.
health enum Health
Aggregated cluster health.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
status enum Status
Cluster status.
  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.
zone_id string
ID of the availability zone where the cluster resides.
service_account_id string
ID of service account for the Data Proc manager agent.
bucket string
Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
ui_proxy bool
Whether UI Proxy feature is enabled.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups hosting VMs of the cluster.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.

Monitoring

Field Description
name string
Name of the monitoring system.
description string
Description of the monitoring system.
link string
Link to the monitoring system.

ClusterConfig

Field Description
version_id string
Image version for cluster provisioning. All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific configuration options.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for access to cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.
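
The key format described for properties (service, then a colon, then the property name) can be illustrated with a small client-side helper. This is only a sketch: the function names are hypothetical, and the only key taken from the reference is 'hdfs:dfs.replication'; any other key is an assumption used for illustration.

```python
from typing import Dict, Tuple


def split_property_key(key: str) -> Tuple[str, str]:
    """Split a 'service:property' key into its two parts.

    Only the first colon separates the service from the property name.
    """
    service, separator, prop = key.partition(":")
    if not separator or not service or not prop:
        raise ValueError(f"expected 'service:property', got {key!r}")
    return service, prop


def group_by_service(properties: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group a flat properties map by the service prefix of each key."""
    grouped: Dict[str, Dict[str, str]] = {}
    for key, value in properties.items():
        service, prop = split_property_key(key)
        grouped.setdefault(service, {})[prop] = value
    return grouped
```

For the documented example, split_property_key('hdfs:dfs.replication') yields the service hdfs and the property dfs.replication.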

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

Delete

Deletes the specified cluster.

rpc Delete (DeleteClusterRequest) returns (operation.Operation)

Metadata and response of Operation:

    Operation.metadata:DeleteClusterMetadata

    Operation.response:google.protobuf.Empty

DeleteClusterRequest

Field Description
cluster_id string
Required. ID of the cluster to delete.
To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50.
decommission_timeout int64
Timeout for graceful decommissioning of nodes, in seconds. Default value: 0. Acceptable values are 0 to 86400, inclusive.
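
The documented limits for DeleteClusterRequest can be checked on the client before a call is made. The sketch below is an assumption-laden illustration: DeleteClusterParams is a hypothetical holder class, not part of any SDK, and it only encodes the limits stated in the field descriptions above.

```python
from dataclasses import dataclass

MAX_CLUSTER_ID_LENGTH = 50        # maximum string length from the reference
MAX_DECOMMISSION_TIMEOUT = 86400  # seconds, inclusive upper bound


@dataclass(frozen=True)
class DeleteClusterParams:
    """Hypothetical client-side holder for DeleteClusterRequest fields."""

    cluster_id: str
    decommission_timeout: int = 0  # documented default value

    def __post_init__(self) -> None:
        if not self.cluster_id:
            raise ValueError("cluster_id is required")
        if len(self.cluster_id) > MAX_CLUSTER_ID_LENGTH:
            raise ValueError("cluster_id must be at most 50 characters long")
        if not 0 <= self.decommission_timeout <= MAX_DECOMMISSION_TIMEOUT:
            raise ValueError("decommission_timeout must be 0 to 86400 seconds")
```

Validating early like this gives a clearer error than waiting for the service to reject the request, though the service remains the authority on what is accepted.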

Operation

Field Description
id string
ID of the operation.
description string
Description of the operation. 0-256 characters long.
created_at google.protobuf.Timestamp
Creation timestamp.
created_by string
ID of the user or service account who initiated the operation.
modified_at google.protobuf.Timestamp
The time when the Operation resource was last modified.
done bool
If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata google.protobuf.Any<DeleteClusterMetadata>
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.
result oneof: error or response
The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true, exactly one of error or response is set.
  error google.rpc.Status
The error result of the operation in case of failure or cancellation.
  response google.protobuf.Any<google.protobuf.Empty>
An empty response (google.protobuf.Empty), returned if the operation finished successfully.

DeleteClusterMetadata

Field Description
cluster_id string
ID of the Data Proc cluster that is being deleted.
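
The rules for done, error and response in the Operation table can be read as a small decision procedure. The following is a minimal sketch that assumes the operation has already been converted to a plain mapping with the keys done, error and response; that representation and the function name are hypothetical.

```python
from typing import Any, Mapping, Optional, Tuple


def operation_outcome(op: Mapping[str, Any]) -> Tuple[str, Optional[Any]]:
    """Classify an Operation-like mapping as running, failed or succeeded."""
    error = op.get("error")
    if error is not None:
        # error may be set even while done == false, if a failure was detected.
        return "failed", error
    if not op.get("done", False):
        # done == false and no failure detected: neither field is set.
        return "running", None
    # done == true and error is absent, so response carries the result.
    return "succeeded", op.get("response")
```

Checking error before done matters here, because the description allows error to be set while the operation is still marked as not done.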

Start

Starts the specified cluster.

rpc Start (StartClusterRequest) returns (operation.Operation)

Metadata and response of Operation:

    Operation.metadata:StartClusterMetadata

    Operation.response:Cluster

StartClusterRequest

Field Description
cluster_id string
Required. ID of the cluster to start.
To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50.

Operation

Field Description
id string
ID of the operation.
description string
Description of the operation. 0-256 characters long.
created_at google.protobuf.Timestamp
Creation timestamp.
created_by string
ID of the user or service account who initiated the operation.
modified_at google.protobuf.Timestamp
The time when the Operation resource was last modified.
done bool
If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata google.protobuf.Any<StartClusterMetadata>
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.
result oneof: error or response
The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true, exactly one of error or response is set.
  error google.rpc.Status
The error result of the operation in case of failure or cancellation.
  response google.protobuf.Any<Cluster>
The Cluster resource, returned if the operation finished successfully.

StartClusterMetadata

Field Description
cluster_id string
ID of the Data Proc cluster that is being started.

Cluster

Field Description
id string
ID of the cluster. Generated at creation time.
folder_id string
ID of the folder that the cluster belongs to.
created_at google.protobuf.Timestamp
Creation timestamp.
name string
Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63.
description string
Description of the cluster. The string length in characters must be 0-256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource.
monitoring[] Monitoring
Monitoring systems relevant to the cluster.
config ClusterConfig
Configuration of the cluster.
health enum Health
Aggregated cluster health.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
status enum Status
Cluster status.
  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.
zone_id string
ID of the availability zone where the cluster resides.
service_account_id string
ID of the service account for the Data Proc manager agent.
bucket string
Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
ui_proxy bool
Whether UI Proxy feature is enabled.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups hosting VMs of the cluster.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.

Monitoring

Field Description
name string
Name of the monitoring system.
description string
Description of the monitoring system.
link string
Link to the monitoring system.

ClusterConfig

Field Description
version_id string
Image version for cluster provisioning. All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific configuration options.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for access to cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

Stop

Stops the specified cluster.

rpc Stop (StopClusterRequest) returns (operation.Operation)

Metadata and response of Operation:

    Operation.metadata:StopClusterMetadata

    Operation.response:Cluster

StopClusterRequest

Field Description
cluster_id string
Required. ID of the cluster to stop.
To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50.
decommission_timeout int64
Timeout for graceful decommissioning of nodes, in seconds. Default value: 0. Acceptable values are 0 to 86400, inclusive.

Operation

Field Description
id string
ID of the operation.
description string
Description of the operation. 0-256 characters long.
created_at google.protobuf.Timestamp
Creation timestamp.
created_by string
ID of the user or service account who initiated the operation.
modified_at google.protobuf.Timestamp
The time when the Operation resource was last modified.
done bool
If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata google.protobuf.Any<StopClusterMetadata>
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.
result oneof: error or response
The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true, exactly one of error or response is set.
  error google.rpc.Status
The error result of the operation in case of failure or cancellation.
  response google.protobuf.Any<Cluster>
The Cluster resource, returned if the operation finished successfully.

StopClusterMetadata

Field Description
cluster_id string
ID of the Data Proc cluster that is being stopped.

Cluster

Field Description
id string
ID of the cluster. Generated at creation time.
folder_id string
ID of the folder that the cluster belongs to.
created_at google.protobuf.Timestamp
Creation timestamp.
name string
Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63.
description string
Description of the cluster. The string length in characters must be 0-256.
labels map<string,string>
Cluster labels as key:value pairs. No more than 64 per resource.
monitoring[] Monitoring
Monitoring systems relevant to the cluster.
config ClusterConfig
Configuration of the cluster.
health enum Health
Aggregated cluster health.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
status enum Status
Cluster status.
  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.
zone_id string
ID of the availability zone where the cluster resides.
service_account_id string
ID of the service account for the Data Proc manager agent.
bucket string
Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
ui_proxy bool
Whether UI Proxy feature is enabled.
security_group_ids[] string
User security groups.
host_group_ids[] string
Host groups hosting VMs of the cluster.
deletion_protection bool
Deletion protection inhibits deletion of the cluster.
log_group_id string
ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.

Monitoring

Field Description
name string
Name of the monitoring system.
description string
Description of the monitoring system.
link string
Link to the monitoring system.

ClusterConfig

Field Description
version_id string
Image version for cluster provisioning. All available versions are listed in the documentation.
hadoop HadoopConfig
Data Proc specific configuration options.

HadoopConfig

Field Description
services[] enum Service
Set of services used in the cluster (if empty, the default set is used).
properties map<string,string>
Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property.
For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml.
ssh_public_keys[] string
List of public SSH keys for access to cluster hosts.
initialization_actions[] InitializationAction
Set of initialization actions.

InitializationAction

Field Description
uri string
URI of the executable file.
args[] string
Arguments to the initialization action.
timeout int64
Execution timeout.

ListOperations

Lists operations for the specified cluster.

rpc ListOperations (ListClusterOperationsRequest) returns (ListClusterOperationsResponse)

ListClusterOperationsRequest

Field Description
cluster_id string
Required. ID of the cluster to list operations for. The maximum string length in characters is 50.
page_size int64
The maximum number of results per page to return. If the number of available results is larger than page_size, the service returns a ListClusterOperationsResponse.next_page_token that can be used to get the next page of results in subsequent list requests. Default value: 100. The maximum value is 1000.
page_token string
Page token. To get the next page of results, set page_token to the ListClusterOperationsResponse.next_page_token returned by a previous list request. The maximum string length in characters is 100.

ListClusterOperationsResponse

Field Description
operations[] operation.Operation
List of operations for the specified cluster.
next_page_token string
Token for getting the next page of the list. If the number of results is greater than the specified ListClusterOperationsRequest.page_size, use next_page_token as the value for the ListClusterOperationsRequest.page_token parameter in the next list request.
Each subsequent page will have its own next_page_token to continue paging through the results.
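
The page_token and next_page_token exchange described above can be sketched as a generic loop. This is an illustration only: fetch_page is a hypothetical callable standing in for whatever client call performs one list request, and the sketch assumes an empty next_page_token marks the last page.

```python
from typing import Any, Callable, Iterator, List, Tuple

MAX_PAGE_SIZE = 1000  # documented maximum for page_size

# fetch_page(page_token, page_size) -> (items, next_page_token); hypothetical.
FetchPage = Callable[[str, int], Tuple[List[Any], str]]


def iterate_all(fetch_page: FetchPage, page_size: int = 100) -> Iterator[Any]:
    """Yield every item across pages by following next_page_token."""
    if page_size > MAX_PAGE_SIZE:
        raise ValueError("page_size must not exceed 1000")
    page_token = ""  # the first request carries no token
    while True:
        items, next_page_token = fetch_page(page_token, page_size)
        yield from items
        if not next_page_token:
            return  # assumption: an empty token means there are no more pages
        page_token = next_page_token
```

Because this is a generator, the page_size check runs when iteration starts rather than when iterate_all is called.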

Operation

Field Description
id string
ID of the operation.
description string
Description of the operation. 0-256 characters long.
created_at google.protobuf.Timestamp
Creation timestamp.
created_by string
ID of the user or service account who initiated the operation.
modified_at google.protobuf.Timestamp
The time when the Operation resource was last modified.
done bool
If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata google.protobuf.Any
Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.
result oneof: error or response
The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true, exactly one of error or response is set.
  error google.rpc.Status
The error result of the operation in case of failure or cancellation.
  response google.protobuf.Any
The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is the standard Create/Update, the response should be the target resource of the operation. Any method that returns a long-running operation should document the response type, if any.

ListHosts

Retrieves the list of hosts in the specified cluster.

rpc ListHosts (ListClusterHostsRequest) returns (ListClusterHostsResponse)

ListClusterHostsRequest

Field Description
cluster_id string
ID of the cluster to list hosts for.
To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50.
page_size int64
The maximum number of results per page to return. If the number of available results is larger than page_size, the service returns a ListClusterHostsResponse.next_page_token that can be used to get the next page of results in subsequent list requests. Default value: 100. The maximum value is 1000.
page_token string
Page token. To get the next page of results, set page_token to the ListClusterHostsResponse.next_page_token returned by a previous list request. The maximum string length in characters is 100.
filter string
A filter expression that filters hosts listed in the response.
The expression must specify:
  1. The field name. Currently, filtering is supported only on the Cluster.name field.
  2. An = operator.
  3. The value in double quotes ("). Must be 3-63 characters long and match the regular expression [a-z][-a-z0-9]{1,61}[a-z0-9].
Example of a filter: name=my-host. The maximum string length in characters is 1000.
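
The constraints on the filter value can be checked locally before sending a request. The sketch below follows the numbered rules and therefore wraps the value in double quotes; note that the example in the reference shows the value unquoted, and which form the service accepts has not been verified here. The function name build_name_filter is hypothetical.

```python
import re

# Regular expression taken from the filter description; it implies 3-63 characters.
NAME_VALUE_RE = re.compile(r"[a-z][-a-z0-9]{1,61}[a-z0-9]")
MAX_FILTER_LENGTH = 1000  # documented maximum length of the filter string


def build_name_filter(value: str) -> str:
    """Build a name filter expression after validating the value."""
    if not NAME_VALUE_RE.fullmatch(value):
        raise ValueError(f"invalid name value: {value!r}")
    expression = f'name="{value}"'  # quoted form, per rule 3 of the description
    if len(expression) > MAX_FILTER_LENGTH:
        raise ValueError("filter expression is too long")
    return expression
```

Using fullmatch rather than match ensures the whole value satisfies the pattern, not just a prefix of it.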

ListClusterHostsResponse

Field Description
hosts[] Host
Requested list of hosts.
next_page_token string
Token for getting the next page of the list. If the number of results is greater than the specified ListClusterHostsRequest.page_size, use next_page_token as the value for the ListClusterHostsRequest.page_token parameter in the next list request.
Each subsequent page will have its own next_page_token to continue paging through the results.

Host

Field Description
name string
Name of the Data Proc host. The host name is assigned by Data Proc at creation time and cannot be changed. The name is generated to be unique across all existing Data Proc hosts in Yandex Cloud, as it defines the FQDN of the host.
subcluster_id string
ID of the Data Proc subcluster that the host belongs to.
health enum Health
Status code of the aggregated health of the host.
  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).
compute_instance_id string
ID of the Compute virtual machine that is used as the Data Proc host.
role enum Role
Role of the host in the cluster.
  • MASTERNODE: The subcluster fulfills the master role.
    Master can run the following services, depending on the requested components:
    • HDFS: Namenode, Secondary Namenode
    • YARN: ResourceManager, Timeline Server
    • HBase Master
    • Hive: Server, Metastore, HCatalog
    • Spark History Server
    • Zeppelin
    • ZooKeeper
  • DATANODE: The subcluster is a DATANODE in a Data Proc cluster.
    DATANODE can run the following services, depending on the requested components:
    • HDFS DataNode
    • YARN NodeManager
    • HBase RegionServer
    • Spark libraries
  • COMPUTENODE: The subcluster is a COMPUTENODE in a Data Proc cluster.
    COMPUTENODE can run the following services, depending on the requested components:
    • YARN NodeManager
    • Spark libraries
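
A ListClusterHostsResponse can be summarized by role and health on the client side. This is a minimal sketch that assumes each Host has been converted to a plain dict with the keys name, role and health, using the enum names listed above as strings; the function name and that representation are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Tuple


def summarize_hosts(
    hosts: Iterable[Mapping[str, str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return host names grouped by role, plus names of hosts that are not ALIVE."""
    by_role: Dict[str, List[str]] = defaultdict(list)
    not_alive: List[str] = []
    for host in hosts:
        by_role[host["role"]].append(host["name"])
        if host["health"] != "ALIVE":
            # Covers HEALTH_UNKNOWN, DEAD and DEGRADED.
            not_alive.append(host["name"])
    return dict(by_role), not_alive
```

Grouping by the role string rather than a fixed enum keeps the helper working if a response contains a role value that is not listed in this reference.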

ListUILinks

Retrieves a list of links to web interfaces being proxied by Data Proc UI Proxy.

rpc ListUILinks (ListUILinksRequest) returns (ListUILinksResponse)

ListUILinksRequest

Field Description
cluster_id string
Required. ID of the Hadoop cluster. The maximum string length in characters is 50.

ListUILinksResponse

Field Description
links[] UILink
Requested list of UI links.

UILink

Field Description
name string
url string
