Prefix Search

Asked by Bino

Dear All,

Somedays ago I post a question at pgsql general mailing list.
It's about "prefix"

Suppose I created a database with single table like this :
---------Start---------------

CREATE DATABASE bino;
CREATE TABLE myrecords(record text);
---------End---------------

and I fill myrecords with this :
---------Start---------------
COPY myrecords (record) FROM stdin;
1
12
123
1234
\.
---------End---------------

In my bash script, I have variable called 'vseek', that will be use for query parameter.
How to query the table , for (i.e):

a. If vseek = '127' , I want the result is ==> '12'
b. if vseek = '123987' , I want the result is ==> '123'
c. if vseek = '14789' , I want the result is ==> '1'

I Got two great enlightment from Brian Modora and Harald Fuchs.

Similarity between those 2 solution is that :
1. I have to create pgsql function.
Brian told me to use pl/pgsql, and Harald suggest me (For large database) to use "prefix" a project of pgfoundry by Dimitri Fontaine
2. Special "search" Methode.

Here is their solution

1. Brian's Solution (Using Standard data-type, plus Pl/Pgsql for search)
a. Write a pl/pgsql
---------Start---------------
create or replace function getMatchingRecord(vseek text)
returns text as $$
declare
 str text;
 len integer;
 ret text;
begin
 len := char_length(vseek);
 loop
  exit when len = 0;
  str := substring(vseek from 1 for len);
  select "record" into ret from myrecords where "record" = str;
  if found then
    return ret;
  end if;
  len := len - 1;
 end loop;
end;
$$ language plpgsql;
---------End---------------

b. The Search example
---------Start---------------
KarooDB=> select getMatchingRecord('127');
 getmatchingrecord
-------------------
 12
(1 row)

KarooDB=> select getMatchingRecord('123987');
 getmatchingrecord
-------------------
 123
(1 row)

---------End---------------

2. Harald's Solution (Using Special data type
a. Install "prefix" from http://pgfoundry.org/projects/prefix , so that the database have special data type called "prefix_range"
b. Table and Index create, Use "prefix_range" special data type, And Sample Data Entry
---------Start---------------

CREATE TABLE myrecords (
  record prefix_range NOT NULL,
);
CREATE INDEX myrecords_record_ix ON myrecords USING gist (record);

COPY myrecords (record) FROM stdin;
1
12
123
1234
\.
---------End---------------

c. The Search methode Examples
---------Start---------------
SELECT id, record
FROM myrecords
WHERE record @> '127'
ORDER BY length(record::text) DESC
LIMIT 1;

SELECT id, record
FROM myrecords
WHERE record @> '123987'
ORDER BY length(record::text) DESC
LIMIT 1;

SELECT id, record
FROM myrecords
WHERE record @> '14789'
ORDER BY length(record::text) DESC
LIMIT 1;

---------End---------------

I Tried Both Solution and they all works good .. at least for small numbers of records and direct PGSQL Access

My Question is : How to adopt one of that two solution into OpenErp ?
Note : My current table have about 500,000 records ,

Sincerely
-bino-

Question information

Language:
English Edit question
Status:
Answered
For:
OpenERP Edit question
Assignee:
No assignee Edit question
Last query:
Last reply:
Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) said :
#1

Hello Bino,

I would like to ask you what is your aim?

Is it regarding search making easier and based on prefix?

Are you planning to implement better search algorithm?

or

You want to import your 5 Lac records?

Thank you.

Revision history for this message
Bino (bino) said :
#2

Dear Jay, Sir.

Thankyou for your fast respons

My Goal is to :
- Stage one : add phone billing into openerp
- Stage two : Integrate that billing into guest folio of Hospitality
modules of openerp

Condition/Fact :
1. Phone numbers are variable length
2. PBX/PABX of dialed number have no separator, i.e:
a. Calling OpenErp from my office, will produce dialed number of
90013281813700
b. Calling the Whitehouse, will produce dialed number of 900112024561111

I can stripout the first 4 digit, since it's "in my system" (9) Is the
outcall code , and (001) is our IDD provider code.
So it leave : 3281813700 (10 digit) and 12024561111 (11)

To get the cost of each call, I need tobe able to :
- Get a result of "32" (Belgium Country Code) when I put "3281813700" as
search keyword, and
- Get a result of "1" (US country Code) when I put "12024561111" as
search keyword

To make it worse, potentialy there will be multiple operator inside one
country, Multiple landline operator plus multiple cellular operator.
It impact to the cost.

To get clearer explanation of this numbering problem, you can download
http://astbill.com/files/dump/bigcountry.csv.bz2

Sincerely
-bino-

Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) said :
#3

I recmomend to create your own search method.

And yes, OpenERP does not support plsql code/functions/stored procedures yet.

Thank you.

Can you help with this problem?

Provide an answer of your own, or ask Bino for more information if necessary.

To post a message you must log in.