fixed conflict with the current upstream master

Gurvinder Singh committed on 2015-07-25 20:57:24 +02:00
838 changed files with 155094 additions and 37292 deletions


@@ -1,3 +1,36 @@
<!-- BEGIN MUNGE: UNVERSIONED_WARNING -->
<!-- BEGIN STRIP_FOR_RELEASE -->
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
width="25" height="25">
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
width="25" height="25">
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
width="25" height="25">
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
width="25" height="25">
<img src="http://kubernetes.io/img/warning.png" alt="WARNING"
width="25" height="25">
<h2>PLEASE NOTE: This document applies to the HEAD of the source tree</h2>
If you are using a released version of Kubernetes, you should
refer to the docs that go with that version.
<strong>
The latest 1.0.x release of this document can be found
[here](http://releases.k8s.io/release-1.0/examples/spark/README.md).
Documentation for other releases can be found at
[releases.k8s.io](http://releases.k8s.io).
</strong>
--
<!-- END STRIP_FOR_RELEASE -->
<!-- END MUNGE: UNVERSIONED_WARNING -->
# Spark example
Following this example, you will create a functional [Apache
@@ -19,34 +52,34 @@ The Docker images are heavily based on https://github.com/mattf/docker-spark
This example assumes you have a Kubernetes cluster installed and
running, and that you have installed the ```kubectl``` command line
tool somewhere in your path. Please see the [getting
-started](../../docs/getting-started-guides) for installation
+started](../../docs/getting-started-guides/) for installation
instructions for your platform.
## Step One: Start your Master service
-The Master [service](../../docs/services.md) is the master (or head) service for a Spark
+The Master [service](../../docs/user-guide/services.md) is the master (or head) service for a Spark
cluster.
-Use the [`examples/spark/spark-master.json`](spark-master.json) file to create a [pod](../../docs/pods.md) running
+Use the [`examples/spark/spark-master.json`](spark-master.json) file to create a [pod](../../docs/user-guide/pods.md) running
the Master service.
-```shell
+```sh
$ kubectl create -f examples/spark/spark-master.json
```
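
For orientation, the pod definition for this step has roughly the shape sketched below. This is an illustration, not a copy of the repository's file: the image name and label are assumptions, while `7077` (the master RPC port) and `8080` (the master web UI) are Spark's standard ports.

```json
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "spark-master",
    "labels": { "name": "spark-master" }
  },
  "spec": {
    "containers": [
      {
        "name": "spark-master",
        "image": "gcr.io/google_containers/spark-master",
        "ports": [
          { "containerPort": 7077 },
          { "containerPort": 8080 }
        ]
      }
    ]
  }
}
```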
-Then, use the [`examples/spark/spark-master-service.json`](spar-master-service.json) file to
+Then, use the [`examples/spark/spark-master-service.json`](spark-master-service.json) file to
create a logical service endpoint that Spark workers can use to access
the Master pod.
-```shell
+```sh
$ kubectl create -f examples/spark/spark-master-service.json
```
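
The service selects the master pod by label and exposes port 7077 under a stable, well-known name. A minimal sketch of what such a service definition could contain (the values here are illustrative assumptions, not the repository's exact file):

```json
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "spark-master"
  },
  "spec": {
    "ports": [
      { "port": 7077, "targetPort": 7077 }
    ],
    "selector": { "name": "spark-master" }
  }
}
```

Workers can then reach the master as `spark://spark-master:7077`, either through the `SPARK_MASTER_SERVICE_HOST`/`SPARK_MASTER_SERVICE_PORT` environment variables Kubernetes injects into pods, or by name if a cluster DNS add-on is running.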
### Check to see if Master is running and accessible
-```shell
+```sh
$ kubectl get pods
-NAME READY REASON RESTARTS AGE
+NAME READY STATUS RESTARTS AGE
[...]
spark-master 1/1 Running 0 25s
@@ -54,7 +87,7 @@ spark-master 1/1 Running 0 25
Check logs to see the status of the master.
-```shell
+```sh
$ kubectl logs spark-master
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.4.0-bin-hadoop2.6/sbin/../logs/spark--org.apache.spark.deploy.master.Master-1-spark-master.out
@@ -87,17 +120,17 @@ program.
The Spark workers need the Master service to be running.
Use the [`examples/spark/spark-worker-controller.json`](spark-worker-controller.json) file to create a
-[replication controller](../../docs/replication-controller.md) that manages the worker pods.
+[replication controller](../../docs/user-guide/replication-controller.md) that manages the worker pods.
-```shell
+```sh
$ kubectl create -f examples/spark/spark-worker-controller.json
```
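
A replication controller declares a desired number of worker pods plus a template for stamping them out, and Kubernetes keeps the running count equal to `replicas`. A sketch of the general shape (the image name and label are assumptions; `replicas: 3` matches the three workers that show up later in this walkthrough, and `8081` is Spark's default worker web UI port):

```json
{
  "kind": "ReplicationController",
  "apiVersion": "v1",
  "metadata": {
    "name": "spark-worker-controller"
  },
  "spec": {
    "replicas": 3,
    "selector": { "name": "spark-worker" },
    "template": {
      "metadata": {
        "labels": { "name": "spark-worker" }
      },
      "spec": {
        "containers": [
          {
            "name": "spark-worker",
            "image": "gcr.io/google_containers/spark-worker",
            "ports": [ { "containerPort": 8081 } ]
          }
        ]
      }
    }
  }
}
```

Because the controller owns the pods, the pool can later be resized with a single command, e.g. `kubectl scale rc spark-worker-controller --replicas=5` (assuming a `kubectl` recent enough to have `scale`).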
### Check to see if the workers are running
-```shell
+```sh
$ kubectl get pods
-NAME READY REASON RESTARTS AGE
+NAME READY STATUS RESTARTS AGE
[...]
spark-master 1/1 Running 0 14m
spark-worker-controller-hifwi 1/1 Running 0 33s
@@ -139,7 +172,7 @@ Use the kubectl exec to connect to Spark driver
$ kubectl exec spark-driver -it bash
root@spark-driver:/#
root@spark-driver:/# pyspark
Python 2.7.9 (default, Mar 1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
15/06/26 14:25:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
@@ -155,11 +188,12 @@ SparkContext available as sc, HiveContext available as sqlContext.
>>> sc.parallelize(range(1000)).map(lambda x:socket.gethostname()).distinct().collect()
['spark-worker-controller-u40r2', 'spark-worker-controller-hifwi', 'spark-worker-controller-vpgyg']
```
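
The lambda above relies on the `socket` module, presumably imported in the elided part of the session. Put together, the complete snippet is (runnable as-is inside `pyspark`, where `sc` is predefined):

```python
# Inside the pyspark shell on the driver pod; pyspark predefines `sc`.
import socket

# Spread 1000 tiny tasks across the cluster, then collect the distinct
# hostnames to see which worker pods actually executed them.
sc.parallelize(range(1000)).map(lambda x: socket.gethostname()).distinct().collect()
```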
## Result
You now have services, replication controllers, and pods for the Spark master, the Spark driver, and the Spark workers.
You can take this example to the next step and start using the Apache Spark cluster
you just created; see the [Spark documentation](https://spark.apache.org/documentation.html)
for more information.
## tl;dr
@@ -174,4 +208,6 @@ Make sure the Master Pod is running (use: ```kubectl get pods```).
```kubectl create -f spark-driver.json```
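
Putting the whole walkthrough together, the full sequence is the four `kubectl create` calls followed by the interactive session (paths as used in the steps above):

```sh
$ kubectl create -f examples/spark/spark-master.json
$ kubectl create -f examples/spark/spark-master-service.json
$ kubectl create -f examples/spark/spark-worker-controller.json
$ kubectl create -f examples/spark/spark-driver.json
$ kubectl exec spark-driver -it bash    # then run pyspark inside the driver pod
```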
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/examples/spark/README.md?pixel)]()
<!-- END MUNGE: GENERATED_ANALYTICS -->