Showing posts from November, 2019

How to test AWS Glue jobs locally / run AWS Glue jobs locally – Part 1

AWS Glue – Local Testing using Apache Spark 2.4.3

Recently, AWS released its Glue libraries on GitHub (AWS Glue GitHub). You can download either Glue 0.9 or Glue 1.0 from the corresponding GitHub branch.

glue-1.0: git clone -b glue-1.0
glue 0.9: git clone

Prerequisites:
Maven 3.6.0 or higher
Spark 2.2x or higher

Step 1: Install and configure Maven and Apache Spark. Configure them as per your installation.
>> wget
>> vi ~/.bashrc
export M2
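Before building the Glue libraries, it can help to confirm that the installed tools meet the prerequisites listed above. The following is a minimal sketch, not part of the original post: the helper names (`parse_version`, `meets_minimum`, `tool_version`, `report`) are hypothetical, and it assumes that `mvn` and `spark-submit` are on your PATH and print a dotted version number when called with `--version`.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    found = found + (0,) * (width - len(found))
    minimum = minimum + (0,) * (width - len(minimum))
    return found >= minimum


def tool_version(command):
    """Run a version command; return the parsed version, or None if unavailable."""
    if shutil.which(command[0]) is None:
        return None
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=60
        )
        # Some tools print their version banner to stderr, so check both streams.
        return parse_version(result.stdout + result.stderr)
    except (ValueError, OSError, subprocess.TimeoutExpired):
        return None


def report():
    """Print whether each prerequisite from the post appears to be satisfied."""
    # Minimum versions taken from the prerequisites: Maven 3.6.0, Spark 2.2.
    checks = [
        ("Maven", ["mvn", "--version"], (3, 6, 0)),
        ("Spark", ["spark-submit", "--version"], (2, 2)),
    ]
    for name, command, minimum in checks:
        found = tool_version(command)
        if found is None:
            print(name + ": not found on PATH or version not readable")
        elif meets_minimum(found, minimum):
            print(name + ": OK")
        else:
            print(name + ": installed version is older than required")


if __name__ == "__main__":
    report()
```

Run the script after editing `~/.bashrc` and opening a new shell, so that the updated PATH is picked up.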

Run AWS Glue Job in PyCharm Community Edition – Part 2

Run AWS Glue Job in PyCharm IDE - Community Edition

Step 1: PyCharm – install PySpark using
>> pip install pyspark==2.4.3
Step 2: Download the prebuilt AWS Glue-1.0 jar with Python dependencies:
>> Download_Prebuild_Glue_Jar
Step 3: Copy the awsglue folder and the jar file into your PyCharm project
>>
Step 4: Copy the Python code from my git repository
>>
Step 5: Project Structure
Step 6: On the console, type the following, making sure to use your own path
>> python com/mypackage/pack/
Step 7: If you run into any issues, leave me a comment here :)

In Part 3, we'll see a more advanced example, such as AWS Glue-1.0 with a Snowflake database.
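A quick sanity check for Step 3 can catch a missing folder or jar before the job is run. This is only a sketch under assumptions: the function name `missing_glue_pieces` is hypothetical, and it assumes the `awsglue` folder and the jar file were copied directly into the project root, which the post does not state explicitly.

```python
from pathlib import Path


def missing_glue_pieces(project_root):
    """Return a list of the Step 3 items that are absent from project_root."""
    root = Path(project_root)
    missing = []
    # Assumption: the awsglue package sits directly under the project root.
    if not (root / "awsglue").is_dir():
        missing.append("awsglue folder")
    # Assumption: the prebuilt Glue jar sits next to it; any *.jar file counts.
    if not any(root.glob("*.jar")):
        missing.append("Glue jar file")
    return missing


if __name__ == "__main__":
    problems = missing_glue_pieces(".")
    if problems:
        print("Missing from project: " + ", ".join(problems))
    else:
        print("awsglue folder and jar file found")
```

Run it from the project directory; an empty result means both items from Step 3 are in place.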