Tuesday, April 16, 2019

How Excel can put you in trouble...

I have a Large data , the head of it will look like the one as given below
2.33546E+11
ee
21-04-2017 13:04
ACTIVE
Pop
4
2017
2.33246E+11
gg
21-04-2017 15:43
ACTIVE
SMS
4
2017
2.33245E+11
hh
22-04-2017 21:52
ACTIVE
Intg
4
2017
2.33244E+11
jj
22-04-2017 22:09
ACTIVE
Intg
4
2017

My file did not  have headers. I have 10 such files, need to perform analysis  and get some insights.

The very first step I did was to insert a row and gave the names of the headers .
mobile_num
keyword
Date and time
status_name
mode_name
month
year
2.33546E+11
ee
21-04-2017 13:04
ACTIVE
Pop
4
2017

Saved the file in the csv format. Used R to read the file . It gave the summary . Every thing went fine.
I was about to compare the column of mobile_num of one file with another, it gave unexpected  output  which put me in lot of trouble.
What went wrong ?
After several attempts to check my code, test my logic, I was not convinced that my R prog has malfunctioned. Finally decided to see the data what R has read. The mobile numbers were read as 6 numbers followed by zeros as given in the exponential value. Truly unimaginable to the core!! how did it happen.  I don't know. But I have figured out what should have been done to avoid this.

1. While saving a column with exponential number / a 12 digit number it has to be formatted as number with no decimal places.
2. Insert a row above
3. paste the header names
4. save it and re open for confirmation.

You could save atleast  your own time if you know the above.

No comments:

Post a Comment