Nature of ETL and Spark

1) Google Colab

27-Dec-21

Google Colab

2) Cloud vendor space

26-Oct-19

Cloud vendor space

How to upgrade an Azure VM? I have a VM in Azure. Looking at the bill, I realized it is not a reserved instance (RI). With a reserved instance, the price can drop to between a half and a third of the regular price. So I wanted to convert this VM to a 3-year reserved instance. By Azure's cost calculator, that comes to about 300 dollars for 3 years.

However, when I tried to upgrade the VM (Azure calls it a resize), I realized that this is an older VM that cannot be reserved for 3 years. So this article goes into how to find a suitable VM that can be reserved for 3 years, and how to migrate my OS disk and data to the new VM.

In addition, the VHD (virtual hard disk) I had turned out to be an unmanaged disk, which I needed to convert to a managed disk. This article covers that as well, with some useful reads from the Microsoft Azure site.

How are HDFS and HBase different?

5) Hive

15-Jul-19

Hive

6) Apache Pig, Grunt

15-Jul-19

Apache Pig, Grunt

What is clustered resource management and YARN?

8) Sqoop

14-Jul-19

Sqoop

9) Apache Flume

14-Jul-19

Apache Flume

10) Parquet format

6-May-19

Parquet format

11) PuTTY

25-Mar-19

PuTTY

12) First MapReduce program

21-Mar-19

First MapReduce program

13) MapReduce

21-Mar-19

MapReduce

14) Cassandra

7-Jan-19

Cassandra

15) Firebase

6-Jan-19

Firebase

16) CouchDB

6-Jan-19

CouchDB

What is the foundational Java API used to write to HDFS?

What is a Hadoop edge server or node?

What is a Hadoop Cluster?

How is HDFS fault tolerant?

How do you store a big CSV file in HDFS?

22) Key big data references

23-Nov-18

Key big data references

What is S3? How is it different from HDFS?

25) Avro - Schemas

16-Aug-17

Avro - Schemas

26) Cassandra

25-Jul-17

Cassandra

Cloud, Salesforce, Amazon