The underlying example is the classic word count from the official PySpark documentation.
# the first step involves reading the source text file from HDFS
text_file = sc.textFile("hdfs://...")
# the actual computation: the word-count chain from the documentation
counts = (text_file.flatMap(lambda line: line.split(" "))
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
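To see what the flatMap/map/reduceByKey chain computes without a running cluster, here is a minimal pure-Python sketch of the same word count; the helper name `word_count` and the sample lines are made up for illustration:

```python
from collections import Counter

def word_count(lines):
    # flatMap: split every line into words;
    # map + reduceByKey: pair each word with 1 and sum the counts per word
    words = [word for line in lines for word in line.split(" ")]
    return dict(Counter(words))

print(word_count(["to be or", "not to be"]))  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In Spark the same aggregation happens in parallel across partitions, but the result is identical.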
There are two ways to consume data from an AWS S3 bucket.
Using the sc.textFile (or sc.wholeTextFiles) API: this API also works with HDFS and the local file system.
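The practical difference between the two APIs: sc.textFile yields one record per line, while sc.wholeTextFiles yields one (path, full_content) pair per file. A small sketch simulating both behaviors with local file I/O (the file name and contents are made up):

```python
import os
import tempfile

# simulate what each API would return for a two-line file
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "sample.txt")
    with open(path, "w") as f:
        f.write("first line\nsecond line\n")
    with open(path) as f:
        text_file_records = f.read().splitlines()   # like sc.textFile: one record per line
    with open(path) as f:
        whole_file_records = [(path, f.read())]     # like sc.wholeTextFiles: one pair per file

print(text_file_records)        # ['first line', 'second line']
print(len(whole_file_records))  # 1
```

wholeTextFiles is the better fit when record boundaries span lines (e.g. one JSON or XML document per file); textFile is the default choice for line-oriented data.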
aws_config = {}  # set your AWS credentials here (the dict keys below are illustrative names)
sc._jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", aws_config["access_key"])
sc._jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", aws_config["secret_key"])
text_file = sc.textFile("s3n://...")
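Note that on recent Hadoop versions the s3n connector is deprecated in favor of s3a, which is the maintained S3 filesystem implementation. A sketch of the equivalent s3a configuration, assuming a live SparkContext `sc`; the bucket and path are placeholders:

```python
# not runnable without a cluster: requires a live SparkContext `sc`
# and the hadoop-aws jars on the classpath
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.access.key", "<your access key>")  # placeholder
hadoop_conf.set("fs.s3a.secret.key", "<your secret key>")  # placeholder
text_file = sc.textFile("s3a://your-bucket/path/to/data.txt")  # illustrative path
```

In production, prefer IAM instance roles or credential providers over hard-coding keys in the Hadoop configuration.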