feast_spark.pyspark.launchers.k8s package
Submodules
feast_spark.pyspark.launchers.k8s.k8s module
- class feast_spark.pyspark.launchers.k8s.k8s.KubernetesBatchIngestionJob(api: CustomObjectsApi, namespace: str, job_id: str, feature_table: str)[source]
Bases: KubernetesJobMixin, BatchIngestionJob
Ingestion job result for a k8s cluster
- class feast_spark.pyspark.launchers.k8s.k8s.KubernetesJobLauncher(namespace: str, incluster: bool, staging_location: str, generic_resource_template_path: Optional[str], batch_ingestion_resource_template_path: Optional[str], stream_ingestion_resource_template_path: Optional[str], historical_retrieval_resource_template_path: Optional[str], staging_client: AbstractStagingClient)[source]
Bases: JobLauncher
Submits Spark jobs to a Spark cluster running on Kubernetes. Supports historical feature retrieval, batch ingestion, scheduled batch ingestion, and stream ingestion jobs.
- historical_feature_retrieval(job_params: RetrievalJobParameters) RetrievalJob [source]
Submits a historical feature retrieval job to a Spark cluster.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
Wrapper around the remote job that provides the file URI of the result file.
- Return type
RetrievalJob
- list_jobs(include_terminated: bool, project: Optional[str] = None, table_name: Optional[str] = None) List[SparkJob] [source]
- offline_to_online_ingestion(ingestion_job_params: BatchIngestionJobParameters) BatchIngestionJob [source]
Submits a batch ingestion job to a Spark cluster.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
Wrapper around the remote job that can be used to check when the job has completed.
- Return type
BatchIngestionJob
- schedule_offline_to_online_ingestion(ingestion_job_params: ScheduledBatchIngestionJobParameters)[source]
Schedule a batch ingestion job using Spark Operator.
- Raises
SparkJobFailure – Failed to create the ScheduledSparkApplication resource, or the operation timed out.
- Returns
Wrapper around the scheduled job that can be used to check the job ID.
- Return type
ScheduledBatchIngestionJob
- start_stream_to_online_ingestion(ingestion_job_params: StreamIngestionJobParameters) StreamIngestionJob [source]
Starts a stream ingestion job to a Spark cluster.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
Wrapper around the remote job.
- Return type
StreamIngestionJob
- class feast_spark.pyspark.launchers.k8s.k8s.KubernetesJobMixin(api: CustomObjectsApi, namespace: str, job_id: str)[source]
Bases: object
- get_status() SparkJobStatus [source]
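get_status() returns the job's current SparkJobStatus, so a caller can poll it until the job reaches a terminal state. A minimal sketch of that pattern follows; the SparkJobStatus members shown here (STARTING, IN_PROGRESS, FAILED, COMPLETED) and the wait_for_completion helper are illustrative assumptions, not part of this module — the real enum is defined elsewhere in feast_spark.

```python
import enum
import time


class SparkJobStatus(enum.Enum):
    """Stand-in for the real feast_spark status enum; member names are assumed."""
    STARTING = 0
    IN_PROGRESS = 1
    FAILED = 2
    COMPLETED = 3


def wait_for_completion(job, timeout_sec=600, poll_interval_sec=5):
    """Poll job.get_status() until a terminal state, or raise on timeout."""
    deadline = time.monotonic() + timeout_sec
    while True:
        status = job.get_status()
        if status in (SparkJobStatus.FAILED, SparkJobStatus.COMPLETED):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job did not finish within {timeout_sec}s")
        time.sleep(poll_interval_sec)
```

Any object exposing get_status() with compatible return values works with this helper, including the mixin subclasses listed on this page.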
- class feast_spark.pyspark.launchers.k8s.k8s.KubernetesRetrievalJob(api: CustomObjectsApi, namespace: str, job_id: str, output_file_uri: str)[source]
Bases: KubernetesJobMixin, RetrievalJob
Historical feature retrieval job result for a k8s cluster
- get_output_file_uri(timeout_sec=None, block=True)[source]
Get the file URI of the result file. By default, this method blocks until the job succeeds, and raises an exception if the job does not complete successfully within the timeout.
- Parameters
timeout_sec (int) – Maximum number of seconds to wait for the job to finish. If timeout_sec is exceeded or the job fails, an exception is raised.
block (bool) – If False, return the output URI immediately without waiting for the job to finish; timeout_sec is then ignored.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
File URI of the result file.
- Return type
str
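The two call patterns for get_output_file_uri can be sketched as follows. FakeRetrievalJob is a hypothetical stand-in that mimics the documented semantics (block until success with a timeout, or return immediately when block=False); the URI and the 600-second default are assumptions for illustration.

```python
import time


class FakeRetrievalJob:
    """Stand-in mimicking the documented get_output_file_uri semantics."""

    def __init__(self, output_file_uri, ready_after_sec=0.0):
        self._uri = output_file_uri
        self._ready_at = time.monotonic() + ready_after_sec

    def _succeeded(self):
        return time.monotonic() >= self._ready_at

    def get_output_file_uri(self, timeout_sec=None, block=True):
        if not block:
            # Non-blocking: return the URI immediately; timeout_sec is ignored.
            return self._uri
        deadline = time.monotonic() + (timeout_sec or 600)  # assumed default
        while time.monotonic() < deadline:
            if self._succeeded():
                return self._uri
            time.sleep(0.01)
        # The real API raises SparkJobFailure here.
        raise RuntimeError("job did not succeed within timeout")


job = FakeRetrievalJob("gs://bucket/output/result.parquet")
uri = job.get_output_file_uri(timeout_sec=5)  # blocks until success
```

Against a real KubernetesRetrievalJob, the blocking call is the usual choice in batch pipelines, while block=False suits callers that poll get_status() themselves.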
- class feast_spark.pyspark.launchers.k8s.k8s.KubernetesStreamIngestionJob(api: CustomObjectsApi, namespace: str, job_id: str, job_hash: str, feature_table: str)[source]
Bases: KubernetesJobMixin, StreamIngestionJob
Ingestion streaming job for a k8s cluster
feast_spark.pyspark.launchers.k8s.k8s_utils module
- class feast_spark.pyspark.launchers.k8s.k8s_utils.JobInfo(job_id, job_type, job_error_message, namespace, extra_metadata, state, labels, start_time)[source]
Bases: tuple
- property extra_metadata
Alias for field number 4
- property job_error_message
Alias for field number 2
- property job_id
Alias for field number 0
- property job_type
Alias for field number 1
- property labels
Alias for field number 6
- property namespace
Alias for field number 3
- property start_time
Alias for field number 7
- property state
Alias for field number 5
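JobInfo is a plain namedtuple, and the property list above is simply its field order. A minimal reconstruction, with field names taken from the signature above (the sample values are hypothetical):

```python
from collections import namedtuple

# Field order matches the "Alias for field number N" entries above.
JobInfo = namedtuple(
    "JobInfo",
    ["job_id", "job_type", "job_error_message", "namespace",
     "extra_metadata", "state", "labels", "start_time"],
)

info = JobInfo(
    job_id="feast-ingestion-123",          # field 0
    job_type="BATCH_INGESTION",            # field 1
    job_error_message="",                  # field 2
    namespace="spark-operator",            # field 3
    extra_metadata={},                     # field 4
    state="COMPLETED",                     # field 5
    labels={"feast.dev/table": "driver_stats"},  # field 6
    start_time=None,                       # field 7
)

assert info.job_id == info[0]   # "Alias for field number 0"
assert info.state == info[5]    # "Alias for field number 5"
```

Because it is a tuple, JobInfo instances are immutable and support both positional and attribute access.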
Module contents
- class feast_spark.pyspark.launchers.k8s.KubernetesBatchIngestionJob(api: CustomObjectsApi, namespace: str, job_id: str, feature_table: str)[source]
Bases: KubernetesJobMixin, BatchIngestionJob
Ingestion job result for a k8s cluster
- class feast_spark.pyspark.launchers.k8s.KubernetesJobLauncher(namespace: str, incluster: bool, staging_location: str, generic_resource_template_path: Optional[str], batch_ingestion_resource_template_path: Optional[str], stream_ingestion_resource_template_path: Optional[str], historical_retrieval_resource_template_path: Optional[str], staging_client: AbstractStagingClient)[source]
Bases: JobLauncher
Submits Spark jobs to a Spark cluster running on Kubernetes. Supports historical feature retrieval, batch ingestion, scheduled batch ingestion, and stream ingestion jobs.
- historical_feature_retrieval(job_params: RetrievalJobParameters) RetrievalJob [source]
Submits a historical feature retrieval job to a Spark cluster.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
Wrapper around the remote job that provides the file URI of the result file.
- Return type
RetrievalJob
- list_jobs(include_terminated: bool, project: Optional[str] = None, table_name: Optional[str] = None) List[SparkJob] [source]
- offline_to_online_ingestion(ingestion_job_params: BatchIngestionJobParameters) BatchIngestionJob [source]
Submits a batch ingestion job to a Spark cluster.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
Wrapper around the remote job that can be used to check when the job has completed.
- Return type
BatchIngestionJob
- schedule_offline_to_online_ingestion(ingestion_job_params: ScheduledBatchIngestionJobParameters)[source]
Schedule a batch ingestion job using Spark Operator.
- Raises
SparkJobFailure – Failed to create the ScheduledSparkApplication resource, or the operation timed out.
- Returns
Wrapper around the scheduled job that can be used to check the job ID.
- Return type
ScheduledBatchIngestionJob
- start_stream_to_online_ingestion(ingestion_job_params: StreamIngestionJobParameters) StreamIngestionJob [source]
Starts a stream ingestion job to a Spark cluster.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
Wrapper around the remote job.
- Return type
StreamIngestionJob
- class feast_spark.pyspark.launchers.k8s.KubernetesRetrievalJob(api: CustomObjectsApi, namespace: str, job_id: str, output_file_uri: str)[source]
Bases: KubernetesJobMixin, RetrievalJob
Historical feature retrieval job result for a k8s cluster
- get_output_file_uri(timeout_sec=None, block=True)[source]
Get the file URI of the result file. By default, this method blocks until the job succeeds, and raises an exception if the job does not complete successfully within the timeout.
- Parameters
timeout_sec (int) – Maximum number of seconds to wait for the job to finish. If timeout_sec is exceeded or the job fails, an exception is raised.
block (bool) – If False, return the output URI immediately without waiting for the job to finish; timeout_sec is then ignored.
- Raises
SparkJobFailure – The Spark job submission failed, the job encountered an error during execution, or it timed out.
- Returns
File URI of the result file.
- Return type
str
- class feast_spark.pyspark.launchers.k8s.KubernetesStreamIngestionJob(api: CustomObjectsApi, namespace: str, job_id: str, job_hash: str, feature_table: str)[source]
Bases: KubernetesJobMixin, StreamIngestionJob
Ingestion streaming job for a k8s cluster