Submitting Spark Tasks
1.spark-submit
https://spark.apache.org/docs/latest/submitting-applications.html
The `spark-submit` script in Spark's `bin` directory is used to launch applications on a cluster. It can use all of Spark's supported cluster managers through a uniform interface so you don't have to configure your application especially for each one.
```bash
./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]
```
Some of the commonly used options are:

- `--class`: The entry point for your application (e.g. `org.apache.spark.examples.SparkPi`)
- `--master`: The master URL for the cluster (e.g. `spark://23.195.26.187:7077`)
- `--deploy-mode`: Whether to deploy your driver on the worker nodes (`cluster`) or locally as an external client (`client`) (default: `client`) †
- `--conf`: Arbitrary Spark configuration property in key=value format. For values that contain spaces wrap "key=value" in quotes (as shown). Multiple configurations should be passed as separate arguments. (e.g. `--conf <key>=<value> --conf <key2>=<value2>`)
- `application-jar`: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an `hdfs://` path or a `file://` path that is present on all nodes.
- `application-arguments`: Arguments passed to the main method of your main class, if any
Here we are on the client side; where the driver actually runs depends on the deploy mode.
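As a concrete end-to-end illustration, below is a minimal PySpark script in the spirit of Spark's bundled pi example; the file name `pi.py`, the master URL, and the exact launch command in the comment are assumptions for this sketch:

```python
# pi.py -- a minimal sketch of a submittable PySpark app (names assumed).
# Example launch (hypothetical host):
#   ./bin/spark-submit --master spark://<host>:7077 --deploy-mode client pi.py 10
# With client mode the driver runs inside this spark-submit process;
# with cluster mode it would be shipped to a worker node instead.
import sys
from operator import add
from random import random

from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession.builder.appName("PythonPi").getOrCreate()
    # application-arguments: number of sample partitions, defaulting to 2
    partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
    n = 100000 * partitions

    def inside(_):
        # Sample a point in the unit square; count it if it falls in the circle.
        x, y = random() * 2 - 1, random() * 2 - 1
        return 1 if x * x + y * y <= 1 else 0

    count = spark.sparkContext.parallelize(range(n), partitions).map(inside).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / n))
    spark.stop()
```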
2.python file.py
This should only work in local and client mode. If the code specifies cluster mode in this situation, it throws an error:
```python
config("spark.submit.deployMode", "cluster")
```
```
Exception in thread "main" org.apache.spark.SparkException: Cluster deploy mode is not applicable to Spark shells.
```
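For contrast, a minimal sketch of a session that does work when the script is started with plain `python file.py`; the app name and master URL here are assumptions:

```python
from pyspark.sql import SparkSession

# Started as "python file.py": this process itself hosts the driver, so only
# local masters or the (default) client deploy mode are possible. "cluster"
# would require spark-submit to ship the driver elsewhere, hence the
# exception above.
spark = (SparkSession.builder
         .appName("plain-python-demo")   # assumed app name
         .master("local[2]")             # or spark://<host>:7077 with client mode
         .getOrCreate())

print(spark.range(100).count())
spark.stop()
```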
3.jupyter notebook
This should also only work in local and client mode.
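The situation in a notebook is the same as with `python file.py`: the kernel process hosts the driver, so only local or client mode applies. A minimal sketch, assuming `pyspark` is importable from the notebook's environment:

```python
from pyspark.sql import SparkSession

# Inside Jupyter the notebook kernel is the driver process, so the session
# must use a local master or a cluster master with client deploy mode.
spark = (SparkSession.builder
         .appName("notebook-demo")   # assumed app name
         .master("local[*]")
         .getOrCreate())

spark.range(5).show()
spark.stop()
```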