Adding records in Batch

Asked by ehr on 2008-11-06

Hi,

I have about 2000 records of persons and about 800 counties and a whole number of facilities in a excel file. Is there a way to add these records in a batch way style in one shot? Like export my data to a CSV and important it into the database. I noticed there are a number of tables updated when you add for example a county record and this makes it tricky to just target one table for adding the bulk data. Anybody with the solution on this?

Regards,

Chanda

Question information

Language:
English Edit question
Status:
Solved
For:
iHRIS Manage Edit question
Assignee:
No assignee Edit question
Solved by:
ehr
Solved:
2008-11-12
Last query:
2008-11-12
Last reply:
2008-11-11
Carl Leitner (litlfred) said : #1

Yes, it is a bit of a pain to try to add the data directly to the mysql tables -- this is so the system can track the changes to the data and so that you can dynamically add fields and forms to the system.

We have had to do this a number of times, and I would suggest making use of the forms (look at i2ce/modules/Forms/lib/I2CE_Form to get started). The usual way that we do this is to create a php script that runs through the CSV and creates a new form for each row. It is kind of hard to say exactly what you need to do without looking at your data, but I hope this will give you an idea.

The geographic data is a bit complex, but has to be done before you can add people... you will have to add first all the countries (you may have only one). Then any regions, then the districts, then the counties for that district. You have to make sure that as you go along, you set the appropriate links between the data. For example you could do something like (I haven't tested this script, but it should be a fair enough outline):
       $factory = I2CE_FormFactory::instance();
       $user = new I2CE_User();
       foreach ($geographic_data as $country=>&$country_data) {
            $country_form = $factory->createForm('country');
            $country_form->name = $country;
               //the country name, such as 'United States'
            $country_form->alpha_two = $counry_data['alpha_two'];
                //the two letter alphabetic code for the country. e.g. US
            $country_form->location = $country_data['is_selectable_location'];
               //true or false if the location should be selectable in a drop-down
           $country_form->primary = $country_data['primary'];
              //true or false. true if it is the default selection in a drop-down (only one should be true)
           if (! $country_form->save($user)) {
                  echo "Could not save the country $country\n";
                  continue;
            }
            $country_data['id'] = $country_form->getId();
            $country_form->cleanup();
            //now we need to add in the regions for each country.
            foreach ($country_data['regions'] as $region=>&$region_data) {
                 $region_form = $factory->createForm('region');
                 $region_form->name = $region; //a region name, such as North-east
                 $region_form->country = $country_data['id'];
                 $region_form->code = $region_data['code']; //an optional code for the region, such as NE
                 if (!$region_form->save()) {
                      echo "Could not save the region $region under the country $country\n";
                      continue;
                 }
                 $region_data['id'] = $region_form->getId();
                 $region_form->cleanup();
                 /***
                  * Insert code to handle the districts, and then the counties under the districts
                  * foreach ($region_data['districts'] as $distict=>&$district_data) ... etc
                  */

           }
        }
Note: the cleanup function is so the memory used by creating the form will be return to the system by
the garbage collector.

Now we can add in the people

Suppose you have a 'people.csv' file with each row corresponding to a person. Use whatever you want to read in the
CSV file into the array $people whose values are the row the the CSV file, already split up into the columns.
Suppose the columns are laid out like so:
    0 = firstname
    1 = middle name
    2 = lastname
    3 = nationality
    4 = country of residence
    6 = region of residence
    5 = district of residence
Then you would do (in the same script):
       foreach ($people as $i=>$cols) {
             $person_form = $factory->createForm('person');
             $person_form->firstname = $cols[0];
             $person_form->othername = $cols[1];
             $person_form->surname = $cols[2];
             $person_form->nationality = $geographic_data[$cols[3]]['id']; //get the country id
             $person_form->residence_country = $geographic_data[$cols[4]]['id']; //get the country id
             $person_form->residence_district =
                     $geographic_data[$cols[4]]['regions'][$cols[5]]['districts'][$cols[6]]['id'];
                    //gets the district id
             if (!$person_form->save($user)) {
                   echo "Could not save the person at row $i:\n\t" . implode (',', $cols) . "\n";
            }
            $person_form->cleanup();
      }

If you need so see what fields are available for each form and which field is required for each form, the easiest place is to look in the form browser. For example, look under
        /modules/forms/forms/person
to see what class the person form is associated to (iHRIS_ManagePerson). Then look under
      /modules/forms/formClasses/iHRIS_ManagePerson/fields
to see the available fields and the information about them. You will see under
      /modules/forms/formClasses/iHRIS_ManagePerson/extends
that iHRIS_ManagePerson extends iHRIS_Person, so you can check for more fields under
      /modules/forms/formClasses/iHRIS_Person/fields

Carl Leitner (litlfred) said : #2

I forgot to mention, that the magic data browser, once it has been enabled under "Configure System->Configure Modules", can be accessed under "Configure System->Browse Magic Data"

ehr (chanda-lusakatimes) said : #3

Hi Carl,

Thanks for the answer. Yes I get the idea , thanks. However, am just lost on
where and how to call & install the script. Do I run it on the command line
like

php -a
  // my scripts here
exit ();

or do I create a form that I can call through the browser and the upload my
csv file?

Sorry for asking a lot of newbie questions.

Chanda

On Thu, Nov 6, 2008 at 7:38 PM, Carl Leitner <
<email address hidden>> wrote:

> Your question #50419 on iHRIS Manage changed:
> https://answers.edge.launchpad.net/ihris-manage/+question/50419
>
> Carl Leitner proposed the following answer:
> I forgot to mention, that the magic data browser, once it has been
> enabled under "Configure System->Configure Modules", can be accessed
> under "Configure System->Browse Magic Data"
>
> --
> If this answers your question, please go to the following page to let us
> know that it is solved:
>
> https://answers.edge.launchpad.net/ihris-manage/+question/50419/+confirm?answer_id=1
>
> If you still need help, you can reply to this email or go to the
> following page to enter your feedback:
> https://answers.edge.launchpad.net/ihris-manage/+question/50419
>
> You received this question notification because you are a direct
> subscriber of the question.
>

Carl Leitner (litlfred) said : #4

I would run it from the command line. Are you starting with the "blank" site?
If so, change to the 'manage/sites/blank/pages' directory and look for the
index.php script. Copy it over to something like 'importer.php'. Delete
the lines:
   $page = new Wrangler();
   $page->wrangle();
And put in your import script there. Then, once the script is written, you can
do simply:
   php importer.php

(If you want to do the script piece by piece rather than in one go, you
can do lookups of data that was already entered in. We can talk about
that if you need to)

I'm sorry for the lack of documentation.
cheers,
-carl

Carl Leitner (litlfred) said : #5

I added some documentation for forms, formClasses and how they are related at:
   http://open.intrahealth.org/ihris-docs/form_documentor/
You may also want to check out:
   http://open.intrahealth.org/wiki/index.php/Customizing_iHRIS_Manage
though it may be a bit out of date.

ehr (chanda-lusakatimes) said : #6

 Hi Carl,

Yes am starting with a bank site. Am actually trying to migrate data from
TIMS by JHPIEGO to IHRIS Manage. I have managed to get all the data in a
excel format and am still struggling to import the data into Manage. Please
find attached the two files with sample records am trying to migrate. I have
like 800 facilities and over 2000 persons and I just want to load this base
data in one short

Regards

Boniface

On Fri, Nov 7, 2008 at 2:17 PM, Carl Leitner <
<email address hidden>> wrote:

> Your question #50419 on iHRIS Manage changed:
> https://answers.edge.launchpad.net/ihris-manage/+question/50419
>
> Status: Open => Answered
>
> Carl Leitner proposed the following answer:
> I would run it from the command line. Are you starting with the "blank"
> site?
> If so, change to the 'manage/sites/blank/pages' directory and look for the
> index.php script. Copy it over to something like 'importer.php'. Delete
> the lines:
> $page = new Wrangler();
> $page->wrangle();
> And put in your import script there. Then, once the script is written, you
> can
> do simply:
> php importer.php
>
> (If you want to do the script piece by piece rather than in one go, you
> can do lookups of data that was already entered in. We can talk about
> that if you need to)
>
> I'm sorry for the lack of documentation.
> cheers,
> -carl
>
> --
> If this answers your question, please go to the following page to let us
> know that it is solved:
>
> https://answers.edge.launchpad.net/ihris-manage/+question/50419/+confirm?answer_id=3
>
> If you still need help, you can reply to this email or go to the
> following page to enter your feedback:
> https://answers.edge.launchpad.net/ihris-manage/+question/50419
>
> You received this question notification because you are a direct
> subscriber of the question.
>

Carl Leitner (litlfred) said : #7

For some reason your attachment didn't go through. Can you email me it directly?
<email address hidden>

Cheers,
-carl

Carl Leitner (litlfred) said : #8

Try this script. It worked for me on your data files (exported to CSV) working
off the blank site:

<?php
/*
 * © Copyright 2007, 2008 IntraHealth International, Inc.
 *
 * This File is part of iHRIS
 *
 * iHRIS is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 */
/**
 * @author Carl Leitner <email address hidden>
 * @copyright Copyright &copy; 2007, 2008 IntraHealth International, Inc.
 */

$people_file = 'persons.csv';
$facility_file = 'facilities.csv';

if (!is_readable($people_file)) {
    die("Can't read $people_file\n");
}
if (!is_readable($facility_file)) {
    die("Can't read $facility_file\n");
}

require_once( dirname(__FILE__) . DIRECTORY_SEPARATOR . 'config.values.php');

$local_config = dirname(__FILE__) . DIRECTORY_SEPARATOR .
'local' . DIRECTORY_SEPARATOR . 'config.values.php';
if (file_exists($local_config)) {
    require_once($local_config);
}

if(!isset($i2ce_site_i2ce_path) || !is_dir($i2ce_site_i2ce_path)) {
    echo "Please set the \$i2ce_site_i2ce_path in $local_config";
    exit(55);
}

require_once ($i2ce_site_i2ce_path . DIRECTORY_SEPARATOR . 'I2CE_config.inc.php');

I2CE::initialize($i2ce_site_database_user,
                 $i2ce_site_database_password,
                 $i2ce_site_database,
                 $i2ce_site_user_database,
                 $i2ce_site_module_config
    );

unset($i2ce_site_user_database);
unset($i2ce_site_database);
unset($i2ce_site_database_user);
unset($i2ce_site_database_password);
unset($i2ce_site_i2ce_path);
unset($i2ce_site_module_config);

$factory = I2CE_FormFactory::instance();
$user = new I2CE_User();
$lists = array();

function geo_data($country,$region = null, $district = null, $county = null) {
    if (!is_string($country) || strlen($country) == 0) {
        return 0;
    }
    $country_id = getListId('country',$country);
    if (!is_string($region) || strlen($region) == 0) {
        return $country_id;
    }
    $region_id = getListId('region',$region,'country',$country_id);
    if (!is_string($district) || strlen($district) == 0) {
        return $region_id;
    }
    $district_id = getListId('district',$district, 'region',$region_id);
    if (!is_string($county) || strlen($county) == 0) {
        return $district_id;
    }
    return getListId('county',$county,'district',$district_id);
}

function getListId( $list_name, $value, $top_link = '', $top_link_id = 0, $add_fields = array()) {
    global $lists;
    global $user;
    global $factory;
    if (!array_key_exists($list_name,$lists)) {
        $lists[$list_name] = array();
    }
    $list =&$lists[$list_name];
    if (!array_key_exists($top_link,$list)) {
        $list[$top_link] = array();
    }
    if (!array_key_exists($top_link_id,$list[$top_link])) {
        //get any existing list information in the database.
        $list[$top_link][$top_link_id]['values'] = I2CE_List::listOptions($list_name,$top_link,$top_link_id);
    }
    $values =&$list[$top_link][$top_link_id]['values'];
    $id = array_search($value,$values);
    if ($id !== false) {
        return $id;
    }
    //we did not find the value in the database. we need to add it.
    $form = $factory->createForm($list_name);
    $form->name = $value;
    foreach ($add_fields as $field=>$value) {
        $form->$field = $value;
    }
    if ($top_link_id > 0) {
        $form->setParent($top_link_id);
    }
    if (! $form->save($user)) {
        echo "Could not save the $list_name $value\n";
        return false;
    }
    $id = $form->getId();
    $values[$id] = $value;
    $form->cleanup();
    return $id;
}

// We are supposing you have a 'facility.csv' file with each row corresponding to a facility
// Suppose the columns are laid out like so:
//"Country","Region","District","County","Facility Type","Facility Name","Mailing Address","Telephone","Fax number","Email"
//"South Africa","Eastern Cape","Alfred Nzo","Amahlathi","Clinic","Nqadu Clinic (Nqadu, willowvale",,,"n/a","n/a"
$fh = fopen($facility_file,'r');
if (!is_resource($fh)) {
    die("Bad facilities file $facility_file\n");
}
//skip the first line
fgetcsv($fh);
$i =1;
$facility_types = array();
$skip_facility = null;
while ( ($cols = fgetcsv($fh)) !== false) {
    $i++;
    if (count($cols) != 10) {
        echo "Bad data on line $i:" . implode(',',$cols) . "\n\tHave " . count($cols) . " instead of 10\n";
        continue;
    }
    list($country,$region,$district,$county,$type,$name,$address, $telephone,$fax,$email) = $cols;
    $country_id = geo_data($country);
    $region_id = geo_data($country,$region);
    $district_id = geo_data($country,$region,$district);
    $county_id = geo_data($country,$region,$district,$county);
    $facility_type_id= getListId('facility_type',$type);
    $facility_id = getListId(
        'facility',$name,'',0,
        array('country'=>$country_id,'region'=>$region_id,'district'=>$district_id,'county'=>$county_id,'facility_type'=>$facility_type_id));
    if ($facility_id == 0) {
        echo "Bummer\n";
        continue;
    }
    $facilityForm = $factory->createForm('facility',$facility_id);
    if (!$facilityForm instanceof I2CE_Form) {
        die("Cannot get facility form with id $facility_id\n");
    }
    $existing_contacts = $facilityForm->getChildren('contact');
    if (count($existing_contacts) > 0) {
        $facilityForm->cleanup();
        continue;
    }
    $contactForm = $factory->createForm('contact');
    $contactForm->contact_type = iHRIS_Contact::TYPE_FACILITY;
    $contactForm->address = $address;
    $contactForm->telephone = $telephone;
    $contactForm->fax = $fax;
    $contactForm->email = $email;
    $contactForm->setParent($facility_id);
    if (!$contactForm->save($user)) {
        echo "Could not save the contact at row $i:\n\t" . implode (',', $cols) . "\n";
    }
    $contactForm->cleanup();
    $facilityForm->cleanup();
}

// We are supposing you have a 'people.csv' file with each row corresponding to a person.
// Suppose the columns are laid out like so:
//"FIRST NAME","SURNAME","OTHER NAME","COUNTRY","REGION","DISTRICT","COUNTY"
//"Bonga","Kunene",,"South Africa","Eastern Cape","Alfred Nzo","Umzimkulu"

$fh = fopen($people_file,'r');
if (!is_resource($fh)) {
    die("Bad people file $people_file\n");
}
//skip the first line
fgetcsv($fh);
$i =1;
$skip_person = null;
while ( ($cols = fgetcsv($fh)) !== false) {
    $i++;
    if (count($cols) != 7) {
        echo "Bad data on line $i:" . implode(',',$cols) . "\n\thave " . count($cols) . " instead of 7\n";;
        continue;
    }
    list($firstname, $surname,$othername, $country, $region, $district,$county) = $cols;
    $country_id = geo_data($country);
    $district_id = geo_data($country,$region,$district);
    $county_id = geo_data($country,$region,$district,$county);
    $existing= iHRIS_Person::search(
        array(
            'person'=>array(
                array('field'=>'firstname', 'values'=>array(array('value'=>$firstname))),
                array('field'=>'surname', 'values'=>array(array('value'=>$surname)))
                )
            ));
    if (count($existing) > 0) {
        if (prompt("The person $firstname $surname already exists. Should I skip creation of this person?", $skip_person)) {
            echo "Skipping line " . ($i -1 ) . ": " . implode(",",$cols) . "\n";
            continue;
        }
    }
    $person_form = $factory->createForm('person');
    $person_form->firstname = $firstname;
    $person_form->surname = $surname;
    $person_form->othername = $othername;
    $person_from->nationality = $country_id;
    $person_form->residence_country = $country_id;
    $person_form->residence_district = $district_id;
    $person_form->residence_county = $county_id;

    if (!$person_form->save($user)) {
        echo "Could not save the person at row $i:\n\t" . implode (',', $cols) . "\n";
    }
    $person_form->cleanup();
}

function prompt ($message,&$universal) {
    if ($universal === true) {
        return true;
    }
    if ($universal === false) {
        return false;
    }
    echo "$message: (Yes/No/Always/neVer) ";
    $success =false;
    while (true) {
        $c = strtolower(fread(STDIN,1));
        if ($c=== false) {
            echo ("Bad input");
            die();
        }
        switch ($c) {
        case 'y':
            return true;
        case 'n':
            return false;
        case 'a':
            $universal = true;
            return true;
        case 'v':
            $universal = false;
            return false;
        }
    }
    echo "\n";
}

# Local Variables:
# mode: php
# c-default-style: "bsd"
# indent-tabs-mode: nil
# c-basic-offset: 4
# End:

ehr (chanda-lusakatimes) said : #9

Hi Carl,

Thanks for the script. It worked well despite a few insignificant issues here and there. I also got this error, but it did not affect the import. What could be the issue here

 php importer.php
I2CE: I2CE_MagicData->__addStorage (/var/lib/HRIS/3.1/I2CE/lib/I2CE.php:157): Magic data storage mechanism provided by I2CE_MagicDataStorageAPC is not available

Thanks again for taking time off your busy schedule to help me with this.

Chanda

Carl Leitner (litlfred) said : #10

The "magic data" is stored in the database. Because it is accessed quite heavily on the
webserver, we use APC:
  http://pecl.php.net/package/APC
to cache the results of the database into shared memory which is then shared across
requests. This message is just notifying you that you the APC cache is not available
from the command line. As we are not doing any kind of repeated access to the
database here, you can safely ignore this message. However, if you are seeing it in the
apache log, that means the system is going to run slow, and you'll need to enable the
apc module.

BTW... to monitor for error messages in the apache log file, there is a utility that you
can use to make the messages more legible:
   I2CE/tools/apache_tail.php

-carl

ehr (chanda-lusakatimes) said : #11

Hi Carl,

What is the deal with INT values? When adding simple fields to the form in the xml with code like below, the INT value for the <displayName>The form field type</displayName> does not show up in the form, but STRING_LINE and DATE_YMD works fine. Is there any special treatment the INT needs?

        <configurationGroup name="facility_number">
              <displayName>The field 'facility_used'</displayName>
              <configuration name="formfield">
                <displayName>The form field type</displayName>
                <value>STRING_LINE</value>
              </configuration>
            </configurationGroup>

Chanda

ehr (chanda-lusakatimes) said : #12

Hi Carl,

It now works after adding this to the configuration

----
              <configuration name="in_db" type="boolean">
                <displayName>Store the field in the database</displayName>
                <value>true</value>
              </configuration>
              <configuration name="required" type="boolean">
                <displayName>This field is requried to be set</displayName>
                <value>true</value>
              </configuration>
--

Chanda