Monday, March 12, 2018

Using SPSS Modeler: data type issues

A typical example of data cleansing

When importing data, we may run into problems with the conversion of data types.
Suppose a CSV file needs to be given to SPSS Modeler as the input source for analysis.
Except for the first column, all the columns contain numbers: integers, floats or decimals.
In any case, none of them holds character data.
Yet when the source file is added, the type of these columns is displayed as nominal instead of integer.
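To see which values keep a column from being read as numeric, the file can be scanned before it is loaded. The following is a minimal sketch in Python, run outside of Modeler. The function name `non_numeric_cells`, the `skip_columns` parameter and the sample data are made up for illustration; it assumes the first column is the only text column, as in the case described above.

```python
import csv
import io


def non_numeric_cells(csv_text, skip_columns=1):
    """Return {column name: [values that do not parse as a number]}.

    skip_columns is an assumption for this example: the number of
    leading text columns to ignore (here, the first column).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    problems = {}
    for row in reader:
        for name, cell in zip(header[skip_columns:], row[skip_columns:]):
            try:
                float(cell.strip())
            except ValueError:
                # Empty cells and cells with stray characters end up here.
                problems.setdefault(name, []).append(cell)
    return problems


# Hypothetical sample: "price" contains a thousands comma and a stray semicolon.
sample = 'id,price,qty\nA,10.5,3\nB,"1,200",4\nC,7;,5\n'
print(non_numeric_cells(sample))
```

Any column listed in the result contains at least one value that a numeric parser would reject, which is a likely reason for the whole column being treated as text.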

What can be done about this kind of issue? How should it be handled?

  1.  Check the various options given in the source node- var
  2. The space or tab options may have been checked. Decide whether those are required.
  3. Analyse the source data to see whether any special characters were added, e.g. commas, semicolons, hyphens. These make it hard for Modeler to work out the type of the data.
  4. If needed, delete them without losing the context of the data, or rename them.
  5. These four simple steps can bring the data to its pure form.
  6. After loading the file into Modeler, check whether the software has understood the required data type. It is then possible to change or modify the data, add dummies, and edit it according to our needs.
Learning from experience shared through blogs and forums saves a lot of time during data modeling.
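The cleanup described in steps 3 and 4 can also be done on the raw values before the file reaches Modeler. Below is a rough Python sketch, not a Modeler feature. The helper `clean_numeric` is hypothetical, and it assumes that a comma is a thousands separator rather than a decimal mark, and that semicolons and trailing hyphens are noise. Check those assumptions against your own data first, so that the context of the data is not lost.

```python
import re


def clean_numeric(cell):
    """Strip stray characters from a cell and convert it to int or float.

    Returns None when nothing numeric is left, so the value can be
    reviewed by hand instead of being silently changed.
    """
    text = cell.strip()
    text = text.replace(",", "")        # assumption: comma = thousands separator
    text = re.sub(r"[;\s]+", "", text)  # drop semicolons, spaces and tabs
    text = text.rstrip("-")             # drop a trailing hyphen, keep a leading minus
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical raw values of the kind listed in step 3.
raw_values = [" 1,200 ", "7;", "10.5", "-3", "12-", "abc"]
cleaned = [clean_numeric(value) for value in raw_values]
print(cleaned)
```

Values that come back as `None` are the ones to inspect manually before deciding whether to delete or rename them.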

