Intermediate Python with NumPy, Pandas, SciKit Learn, SciPy, Spark, Streaming & More
This course covers the essentials of using Python as a tool for data scientists to perform exploratory data analysis, build complex visualizations, and run large-scale distributed processing on “Big Data”. We cover essential mathematical and statistical libraries such as NumPy, Pandas, SciPy, and scikit-learn; frameworks such as TensorFlow and Spark; and visualization and imaging tools such as Matplotlib, PIL, and Seaborn. The course is ‘intermediate level’ because it assumes that attendees have a solid background in data analytics and data science as well as basic Python knowledge. Topics are introductory in nature but are covered in depth and are geared toward experienced students.
This course is roughly 50% hands-on lab and 50% lecture, combining engaging instructor presentations, demos, and practical group discussions with extensive machine-based student labs and project work. Throughout the course, students learn to write Python scripts and apply them within a scientific framework, working with the technologies listed on the agenda. The course provides a practical grounding in the range of technologies at the leading edge of data science development.
What You’ll Learn
- Work with Python in a Data Science Context
- Use NumPy, Pandas, and Matplotlib
- Create and process images with PIL
- Visualize with Seaborn
- Interact with Spark using DataFrames
- Use Spark SQL, MLlib, and Streaming with Big Data
Please contact us for a complete course outline.
This course is also available on our public schedule. Contact us for specific dates.