[Isis-users] Larger database in ABCD

Eustache Mêgnigbêto eustache.megnigbeto at outlook.com
Sat Apr 18 12:07:25 CEST 2020


 

Dear Egbert

I sent the following message to the list, but it was not distributed, perhaps
because of the attachment. I then sent it to your personal email address; I
hope you received it.

 

Regards,

 

 

***

Eustache  Mêgnigbêto

 

 

From: Eustache Mêgnigbêto [mailto:eustache.megnigbeto at outlook.com]
Sent: Saturday, April 18, 2020 07:13
To: isis-users at iccisis.org
Subject: RE: [Isis-users] Larger database in ABCD

 

 

Dear Egbert,

 

Records with more than 250 occurrences are stored correctly; they can be
edited and saved in ABCD.

I attach the ISO file, the FDT and the FST.

 

PS: Meanwhile, I changed the FST so that an occurrence is indexed only if
its number is less than or equal to 250; with that change the IF is created
and updated.
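
The selection rule described in the PS can be sketched in a few lines of
Python. This only illustrates the logic and is not the FST itself; the
function name and the plain list of strings used as input are assumptions
made for the example.

```python
def indexable_occurrences(occurrences, limit=250):
    """Return (number, value) pairs for the occurrences that would be indexed.

    Occurrence numbers are 1-based; only numbers <= limit are kept.
    The limit of 250 is the value mentioned in the message.
    """
    return [(n, value)
            for n, value in enumerate(occurrences, start=1)
            if n <= limit]
```

Occurrences beyond the limit stay in the record but produce no keys, so they
cannot be found through a search on that field.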

 

Best,

 

 

====================================================================

Eustache  Mêgnigbêto

Tel. (+229) 95910242 – (+229) 21147935

09 BP 477 Saint Michel, Cotonou (République du Bénin)

Google Scholar: https://scholar.google.com/citations?user=xQk_UhwAAAAJ&hl=fr

Personal website: http://eustachem.ilemi.net

Review activities:
https://publons.com/researcher/503109/eustache-megnigbeto/peer-review/

 

 

 

From: Egbert De Smet [mailto:egbert.desmet at uantwerpen.be]
Sent: Friday, April 17, 2020 11:23
To: Eustache Mêgnigbêto; isis-users at iccisis.org
Subject: Re: [Isis-users] Larger database in ABCD

 

Eustache,

 

As explained in earlier messages on this list, it is not the number of
occurrences itself that is limited, but the total size of these occurrences,
which fills up the maximum record size of CISIS. That is the main limiting
factor, and also the main reason I wanted ABCD 2.x (soon 2.2) to work with
other varieties of CISIS, mainly BigISIS, since that one does incremental
indexing, unlike FFI. FFI is aimed more at 'static' databases with larger
records. The maximum number of records is, in my humble opinion, not the
crucial factor: it is still 2 to the power of 24 (due to the setup of the
XRF), meaning more than 16 million, which is enough for most applications.
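
A quick check of that figure, as a minimal sketch. The only input is the
exponent 24 quoted above; the variable names are chosen for the example.

```python
# Number of distinct values available with 24 bits, the exponent the
# message attributes to the XRF setup.
MFN_BITS = 24
max_records = 2 ** MFN_BITS

print(f"{max_records:,}")  # 16,777,216
```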

Now, back to your concrete problem: I see two options.

1.	Try to avoid such a high number of occurrences (repeats of a field) by
moving them to another database and using REF(L(), the semi-relations feature
of ISIS. You could also consider splitting the record into more than one
record in the same database and using REF(L() on the database itself
('internal REF') rather than on another one.
2.	Test the records with BigISIS. However, we currently have it only on
Linux; we may need to try recompiling CISIS for BigISIS on Windows, now that
64-bit Windows is completely 'normal' (it wasn't at the time). BigISIS allows
records up to 1 MB (possibly still not enough) and a total database size of
up to 512 GB.
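
The splitting idea in option 1 can be sketched as follows. This is a rough
illustration in Python, not ISIS formatting language: the dictionary layout
and the names parent_id, part and occurrences are hypothetical, and the
regrouping step that REF(L() would perform on the shared key is not shown.

```python
from typing import Iterator


def split_record(parent_id: str, base_fields: dict,
                 occurrences: list[str],
                 max_occ: int = 250) -> Iterator[dict]:
    """Yield child records holding at most max_occ occurrences each.

    Every child repeats the basic fields and carries parent_id, so that a
    lookup on that key can bring the parts back together.
    """
    if max_occ < 1:
        raise ValueError("max_occ must be at least 1")
    starts = range(0, len(occurrences), max_occ)
    for part, start in enumerate(starts, start=1):
        yield {
            **base_fields,
            "parent_id": parent_id,
            "part": part,
            "occurrences": occurrences[start:start + max_occ],
        }
```

A field with 600 occurrences would, for example, become three records of 250,
250 and 100 occurrences sharing one parent key.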

Can I ask: do you see such an MFN with a high number of occurrences (250 or
more?) stored correctly in the MST, and only failing to be indexed? That
situation, where the record itself is not too large to store but is too large
to index, is possible, since indexing needs some additional temporary space
to store keys etc. in the 'virtual' ISIS record that CISIS always uses for
internal manipulation. In that case the problem is indeed not the storage but
the indexing. By the way, in ABCD 2.x we use external (and larger) text files
which are still indexed (for full-text indexing of repositories); they are
not within the record but are referred to when indexing (with the parameter
'gload='). I doubt this would be a good solution, because the virtual record
is still used while indexing, so I am not sure it would in the end solve the
problem compared with having the occurrences stored within the record, but
it could be worth a try.

However, in the new version, ABCD 2.2, text files that are too large are
automatically split, with each part referring by a key to the same 'mother
record ID'. Dumping your field with 250+ occurrences into an external text
file, with your basic fields (title, author...) stored as Dublin Core
meta-tags (preserved and stored automatically in each split record), would
therefore also be possible.

If you send me a couple of such records (in ISO-format with accompanying FDT
and FST), I could give it a try as a third option.

 

 

Egbert de Smet
Universiteit Antwerpen

 

  _____  

From: isis-users <isis-users-bounces+egbert.desmet=ua.ac.be at iccisis.org> on
behalf of Eustache Mêgnigbêto <eustache.megnigbeto at outlook.com>
Sent: Friday, April 17, 2020 11:10 AM
To: isis-users at iccisis.org
Subject: Re: [Isis-users] Larger database in ABCD

 

Dear Egbert,

 

I'm using the Windows version of CISIS FFI to process data I downloaded from
the web. Because of the problem with the inverted file key that you drew my
attention to, I limited the data imported into the ISIS database. So, in
fact, the database is no longer as 'large' as it should be.

I have 50,000 records and I continue to add new ones. Recently, I added about
12,000 new records, but while updating the inverted file I received the error
message 'fatal: fullinv/ifload'. I don't understand what this error means,
but I checked using mx with the dict= parameter and found that the IF was
empty. I then suspected the number of occurrences in one repeatable field.
After checking, I noticed that some of the newly added records have more than
250 occurrences. Is such a number of occurrences the cause of the problem?
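
A check of this kind can be sketched in Python. The sketch assumes that the
records are available as a text file in the layout read by id2i, with a
'!ID nnnnnn' line opening each record and '!vNNN!' at the start of each field
line; that layout, the function name and the default limit of 250 are
assumptions for the example, not something confirmed in this thread.

```python
import re
from collections import Counter

ID_LINE = re.compile(r"^!ID\s+(\d+)")      # assumed record marker
FIELD_LINE = re.compile(r"^!v(\d+)!")      # assumed field-line prefix


def records_over_limit(lines, limit=250):
    """Return {mfn: {tag: count}} for tags repeated more than limit times."""
    over = {}

    def flush(mfn, counts):
        # Keep only the tags of this record that exceed the limit.
        heavy = {tag: n for tag, n in counts.items() if n > limit}
        if mfn is not None and heavy:
            over[mfn] = heavy

    mfn, counts = None, Counter()
    for line in lines:
        id_match = ID_LINE.match(line)
        if id_match:
            flush(mfn, counts)
            mfn, counts = int(id_match.group(1)), Counter()
            continue
        field_match = FIELD_LINE.match(line)
        if field_match and mfn is not None:
            counts[int(field_match.group(1))] += 1
    flush(mfn, counts)
    return over
```

Passing an open file object as lines lists the records, and the tags within
them, that exceed the limit.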

 

Many thanks for your response 

 

 


 

 

 

From: Eustache Mêgnigbêto [mailto:eustache.megnigbeto at outlook.com]
Sent: Wednesday, July 10, 2019 09:52
To: Egbert De Smet
Subject: RE: Larger database in ABCD

 

Dear Egbert,

 

I tried it on Windows with the FFI version, and it works. I will try the
BigISIS version on Linux this afternoon.

Thank you very much.

 

Eustache M.

 

From: Egbert De Smet [mailto:egbert.desmet at uantwerpen.be]
Sent: Wednesday, July 10, 2019 08:45
To: Eustache Mêgnigbêto <eustache.megnigbeto at outlook.com>
Subject: Re: Larger database in ABCD

 

Well, obviously the same way : 

CISIS_VERSION = bigisis

 

Egbert de Smet
Universiteit Antwerpen

 

  _____  

From: Eustache Mêgnigbêto <eustache.megnigbeto at outlook.com>
Sent: Wednesday, July 10, 2019 9:23 AM
To: Egbert De Smet
Subject: RE: Larger database in ABCD

 

Dear Egbert,

 

Many thanks,

 

And next, how do I activate BigISIS under Linux?

 

From: Egbert De Smet [mailto:egbert.desmet at uantwerpen.be]
Sent: Wednesday, July 10, 2019 08:09
To: Eustache Mêgnigbêto <eustache.megnigbeto at outlook.com>;
isis-users at iccisis.org
Subject: Re: Larger database in ABCD

 

In the file dr_path.def of the database concerned (in its 'base' folder),
add the line:

CISIS_VERSION=ffi

Note that with FFI no 'incremental indexing' (one record at a time) is
possible, nor word-proximity searching. It is better to use BigISIS, but
that only works on Linux at this time.
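
For several databases the same edit can be scripted. The following is a
minimal sketch in Python, assuming dr_path.def is a plain text file with one
KEY=value setting per line; the function name, the latin-1 encoding and the
example path are assumptions, not taken from the ABCD documentation.

```python
from pathlib import Path


def set_cisis_version(def_file: Path, version: str) -> None:
    """Set or add the CISIS_VERSION line, keeping all other lines."""
    # latin-1 is an assumed encoding; it round-trips any byte unchanged.
    text = def_file.read_text(encoding="latin-1") if def_file.exists() else ""
    kept = [line for line in text.splitlines()
            if line.split("=", 1)[0].strip() != "CISIS_VERSION"]
    kept.append(f"CISIS_VERSION={version}")
    def_file.write_text("\n".join(kept) + "\n", encoding="latin-1")


# Example call (hypothetical path):
# set_cisis_version(Path("/path/to/bases/mydb/dr_path.def"), "ffi")
```

An existing CISIS_VERSION line is replaced rather than duplicated, so the
function can be run again to switch from ffi to bigisis.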

 

Egbert de Smet
Universiteit Antwerpen

 

  _____  

From: isis-users <isis-users-bounces+egbert.desmet=ua.ac.be at iccisis.org> on
behalf of Eustache Mêgnigbêto <eustache.megnigbeto at outlook.com>
Sent: Wednesday, July 10, 2019 8:56 AM
To: isis-users at iccisis.org
Subject: [Isis-users] Larger database in ABCD

 

Dear Egbert,

 

I downloaded some data from the web and converted them to a delimited text
format, then used the id2i utility to convert them to an ISIS database.
However, I noticed that the data were too large to be handled by the standard
CISIS utilities, so I used the FFI versions of id2i to convert and mx to
read, etc.

Now I would like to know how to manage such a database with ABCD, since
standard ABCD cannot handle it, and since in ABCD 2.0f the FFI subfolder of
the cgi-bin folder contains the necessary files. In other words, what changes
should I make in the cgi-bin folder to be able to operate the database with
ABCD?

 

Many thanks in advance.

 

 


 
