[Isis-users] J-ISIS Release Candidate 1.2
Wenke Adam
wenkeadam at gmail.com
Tue Jun 20 08:04:49 CEST 2017
Dear Hussain,
I have converted a large number of databases from dozens of applications
over the years, and the method I have found best so far, because it gives
you control over every step, is the one described below.
Fangorn is still useful for small datasets, but it breaks easily when it
encounters unexpected characters and doesn't tell you what the problem is.
*Converting from csv to isis is easy with the cisis tools*. The mx program
has commands for that. (I don't have the exact commands here at the moment,
but I can send them tomorrow if you need them.)
*Preparing a clean csv is the most crucial task:*
1. In a copy of the original spreadsheet, edit out any hard line breaks
in fields (cells). Repeatable entries should be separated only by a
semicolon and a space: "item1; item2; item3..."
2. Also check the text for #, @, and | (pipe) signs, which mx could
interpret as Isis commands.
3. When the editing is done, use OpenOffice to convert to csv, with
pipes | as field separators. (I don't see this option working in Excel.)
4. Next, prepare a new Isis database with field tags 1, 2, 3, 4, 5, 6...
and so on, as many fields as there are columns in your spreadsheet. Don't
bother with field names, just call them 1, 2, 3, 4, 5, 6...
5. Convert the csv to iso, import it into your newbase, and check how
everything looks.
6. When you are satisfied with the results, use a conversion fst to convert
newbase to the structure of your definitive database, and import there.
7. If the records from the csv are cut off in newbase, or the last fields
of one record seem to land at the beginning of the next, some leftover
Excel/OpenOffice editing character may be causing the problem. This happens,
for example, with hard line breaks when contents have been copied and pasted
from Word or another word processor. Alternatively, you sometimes have to
create your newbase with one more field than there are columns in the
spreadsheet.
8. If you find errors like this, don't give in to the temptation to edit
them in newbase! Go back to your editing spreadsheet, correct them there, and
reimport.
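As a rough illustration of steps 1 to 3 and of the symptom in step 7, here
is a minimal Python sketch. It is not part of the cisis tools. It assumes
the spreadsheet has first been saved as an ordinary comma-separated file
that Python's csv module can read. The function names (clean_cell,
find_suspects, to_pipe_delimited, check_field_counts) are made up for this
example, and whether mx wants the pipe-delimited lines unquoted like this is
also an assumption to check against your own setup.

```python
import csv
import io
import re

# Characters that, per step 2, mx could interpret as Isis commands.
SUSPECT = set("#@|")


def clean_cell(value):
    """Replace hard line breaks (and the whitespace around them) with one space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", value).strip()


def find_suspects(rows):
    """Return (row, column, character) for every suspect character, 1-based."""
    hits = []
    for r, row in enumerate(rows, start=1):
        for c, cell in enumerate(row, start=1):
            for ch in sorted(SUSPECT.intersection(cell)):
                hits.append((r, c, ch))
    return hits


def to_pipe_delimited(rows):
    """Write the rows as pipe-separated text, one record per line, no quoting."""
    width = max((len(row) for row in rows), default=0)
    lines = []
    for row in rows:
        cells = [clean_cell(cell) for cell in row]
        cells += [""] * (width - len(cells))  # pad short rows to full width
        lines.append("|".join(cells))
    return "\n".join(lines) + "\n"


def check_field_counts(text, expected):
    """Return the 1-based numbers of lines whose field count is not `expected`."""
    return [
        n
        for n, line in enumerate(text.splitlines(), start=1)
        if line.count("|") + 1 != expected
    ]


if __name__ == "__main__":
    # Tiny made-up sample: one cell has a hard line break, one has a stray pipe.
    sample = 'title,authors\n"Line one\nline two","item1; item2"\n"A | B","x"\n'
    rows = list(csv.reader(io.StringIO(sample)))
    print(find_suspects(rows))
    text = to_pipe_delimited(rows)
    print(text, end="")
    print(check_field_counts(text, expected=2))
```

The last check is the useful one before importing: any line it reports has
more or fewer fields than the spreadsheet has columns, which is the kind of
record that would spill into the next one in newbase. Fix those cells in the
editing spreadsheet, not in the output file.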
Good luck!
Regards
Wenke
2017-06-19 12:24 GMT-04:00 Hussain KH <hussain.rachana at gmail.com>:
> Thanks Jean.
>
> I have already contacted Dr. Prakash. Indeed, he has done marvellous work
> in Tamil.
>
> J-ISIS can do a lot of things in the Indian context.
>
> I have data in MS Access in Malayalam script and am doing data correction
> now. Apart from MarcEdit, is there any other way (like the old Fangorn) to
> convert delimited text to an isis db? Or have you attempted a method to
> import directly from a csv?
>
> Thanking
>
>
>
> *സസ്നേഹം*
> *- ഹു _______________________*
> *If you optimize everything,*
> *you will always be unhappy. *
> *— DONALD KNUTH*
>
> On Mon, Jun 19, 2017 at 1:12 AM, Jean-Claude Dauphin <jc.dauphin at gmail.com
> > wrote:
>
>> Dear Hussain,
>>
>> Thank you for taking the time to report your findings. I think that
>> J-ISIS will help you to do the job, don't hesitate to ask
>> for help if needed.
>> I recently tested the VIAF data set, which contains a good deal of
>> non-European content. I needed to install the MS Arial Unicode font, which
>> is no longer provided by Microsoft with Windows 10. I can prepare
>> step-by-step instructions on how to download a free MS Arial Unicode font
>> and how to install it on Windows 10.
>> I think it could be useful to contact Dr Prakash Ira who successfully
>> produced a Digital Library for Tamil Imprints Using J-ISIS.
>> ira.prakash at gmail.com
>>
>> Best wishes
>> Jean-Claude
>>
>>
>> On Sun, Jun 18, 2017 at 7:35 AM, Hussain KH <hussain.rachana at gmail.com>
>> wrote:
>>
>>> Dear Jean-Claude and ISIS Friends
>>>
>>> I've just unzipped jisis_suite.11.June.2017 for windows and just opened.
>>> I haven't yet started working.
>>>
>>> My immediate project is to prepare a print-ready copy of a bibliography of
>>> Malayalam books published during 2001-2005. It comes to more than 6,000
>>> books in 62 subjects.
>>> I've been searching for the most suitable package for the purpose. Since
>>> I have been an expert in cds/isis since 1986 and have prepared many printed
>>> bibliographies (the Bamboo Bibliography in 1990 was an acclaimed one), I
>>> know the capability of isis to do the job. Alas! It is only in English.
>>>
>>> For the past three days I was madly searching for a package, reading about
>>> and going through many, like EndNote, Zotero, Mendeley, DB/TextWorks,
>>> JabRef, DBs for LaTeX, etc., without finding the excellent sorting and
>>> formatting facilities provided by isis.
>>> Very disappointed, I googled "cds isis unicode 2017" and, to my great
>>> excitement and luck, I came upon the latest J-ISIS, thanks to the untiring
>>> pursuit of Jean-Claude.
>>>
>>> Now I'm starting my work on the 9th volume of 'Grandhasoochi', a unique
>>> bibliography in all Indian languages, with J-ISIS. Though I haven't explored
>>> it yet, I'm confident that I can accomplish it.
>>>
>>> I haven't used isis for the last three years after my retirement. Now
>>> I'm recollecting all that I did in 25 years with the great isis. I'll
>>> communicate all my new findings in using Unicode Malayalam. Since I'm a
>>> member in developing Unicode language technology in Malayalam and designed
>>> Unicode fonts based on traditional script (Rachana, Meera, Keraleeya,
>>> Uroob, Tamil Meera- Meera Inimai) I hope I can expose many things related
>>> to Indic scripts.
>>>
>>> May your efforts find outstanding results in unknown countries and
>>> languages.
>>>
>>> Loving and Thanking
>>>
>>>
>>>
>>> *സസ്നേഹം*
>>> *- ഹു _______________________*
>>> *If you optimize everything,*
>>> *you will always be unhappy. *
>>> *— DONALD KNUTH*
>>>
>>> On Tue, Jun 13, 2017 at 5:31 PM, Ernesto Spinak <
>>> ernesto_luis_96 at hotmail.com> wrote:
>>>
>>>> Jean Claude
>>>> Good news, thanks for your effort
>>>> Ernesto Spinak
>>>>
>>>> ________________________________________
>>>> From: isis-users [isis-users-bounces+ernesto_luis_96=
>>>> hotmail.com at iccisis.org] on behalf of Jean-Claude Dauphin [
>>>> jc.dauphin at gmail.com]
>>>> Sent: Sunday, 11 June 2017 16:55
>>>> To: <isis-users at iccisis.org>; Jean-Claude Dauphin
>>>> Subject: [Isis-users] J-ISIS Release Candidate 1.2
>>>>
>>>> Dear ISIS Users,
>>>>
>>>> Please find for your consideration the 11 June 2017 Release Candidate
>>>> of J-ISIS. The Release Candidate (RC) is a beta version with potential to
>>>> be a final product, which is ready to release unless significant bugs
>>>> <https://en.wikipedia.org/wiki/Computer_bug> emerge.
>>>> J-ISIS 11 June 2017
>>>> <https://github.com/J-ISIS/J-ISIS/releases/download/v1.2/jisis_suite.11.June.2017.zip>
>>>> The Release Note describes the main Improvements and Bug fixes of
>>>> J-ISIS 11 June 2017 Release Candidate
>>>> J-ISIS 11 June 2017 Release Note
>>>> <https://github.com/J-ISIS/J-ISIS/blob/master/J-ISIS%20release%201-2.pdf>
>>>>
>>>> <https://kenai.com/projects/j-isis/downloads/download/jisis_suite%2015%20February%202016%20RC.zip>
>>>> You will find below a summary of the major bug fixes and improvements,
>>>> but please read the release note, as it contains more details and
>>>> screenshots.
>>>>
>>>> As usual, I would be very grateful if you could take the time to try
>>>> J-ISIS. All your comments, suggestions, improvement requests and bug
>>>> descriptions are welcome.
>>>>
>>>> Best wishes,
>>>> Jean-Claude
>>>>
>>>> J-ISIS Release Candidate 1.2
>>>>
>>>>
>>>> I. Fixes to the J-ISIS Print Format
>>>>
>>>>
>>>>
>>>> 1) Repeatable literals were not working as expected with field
>>>> dummy selectors (D or N):
>>>> |Hello|d270 was producing an empty string even if field 270 was present.
>>>>
>>>>
>>>> 2) Conditional literals with subfield dummy selectors (D or N):
>>>> "Hello"d270^d was always producing Hello as output even if no subfield
>>>> ^d was present.
>>>> The same applied to "Hello"n270^d.
>>>> 3) MFN command was raising an error in REF function expressions
>>>> like:
>>>>
>>>> ref(mfn,
>>>> if p(v19) and v19^x<='0'then", "d963^i,
>>>> (if v19^x<='0'then|<b>|v19^a*2|</b>|,| |v19^b fi)
>>>> fi,
>>>> )
>>>>
>>>> 4) Extracting a fragment of a Subfield specifying only the offset
>>>> (*offset) was not working
>>>>
>>>> V270^a*2 for example
>>>>
>>>> 5) String function F(expr-1 ,expr-2,expr-3)default width value
>>>>
>>>> 6) String functions S, SS, and CISIS functions LEFT, MID, REPLACE,
>>>> and RIGHT were not working in repeatable group.
>>>>
>>>> For example
>>>>
>>>> (if s(v270^d) <> '1966' then '****' else '1966' fi/)
>>>>
>>>> 7) New Print Format Command for Unconditional Literals <text>
>>>> …</text>
>>>>
>>>> Plain text or, more likely, HTML formatting can now be embedded between
>>>> the <text> and </text> tagging commands; it works like an unconditional
>>>> literal.
>>>>
>>>> II. Print Format for Repeatable Subfields
>>>>
>>>> Subfield occurrences
>>>>
>>>> It is possible to access individual occurrences of a repeatable
>>>> subfield by specifying the occurrence number or range, enclosed in square
>>>> brackets, immediately following the field selector or the field selector
>>>> followed by an occurrence selector. For example:
>>>>
>>>> V270[1]^a[2],v270[1]^a[2]
>>>>
>>>> It is possible to display a specific occurrence of a repeatable subfield,
>>>> narrowing the output to one or a range of occurrences of a repeatable
>>>> subfield, by specifying the occurrence number or range, enclosed in square
>>>> brackets, immediately following the field selector. For example:
>>>>
>>>> v10^a[1]
>>>>
>>>> It is coded as follows:
>>>>
>>>> [<index> [..<upper index>]]
>>>>
>>>> <index> and <upper index> refer to the first (or only) and last
>>>> occurrences, respectively. If the specified <index> is greater than the
>>>> actual number of occurrences, no output is generated. The same happens if
>>>> the data subfield is not repeatable and <index> is set to a number equal to
>>>> or greater than 2. However, if <index> is set to 1 and it is used on a
>>>> non-repeatable subfield, the content is output normally. This component
>>>> must be used outside a repeatable group; otherwise, <upper index> is
>>>> ignored. If the double dot (..) is used and <upper index> is missing, LAST
>>>> is assumed. The LAST keyword is set to the total number of occurrences of
>>>> the data subfield.
>>>> III. Print Format Global Variables
>>>> Global variables are stored in a virtual ISIS record, which is a
>>>> collection of fields. Fields may be repeatable and have occurrences, and
>>>> fields or occurrences may have subfields. The record, field and subfield
>>>> concepts are identical to those of ISIS.
>>>>
>>>> Global variables are referenced by the letter G followed by the tag of
>>>> the field. The G (a mnemonic code for Global variable) followed by the
>>>> virtual record tag is the command telling J-ISIS that you want to assign or
>>>> extract a field. It may be entered in either upper or lower case.
>>>> Global variables can be assigned data through the Print Format commands:
>>>> g100:=((v25/)),(g100^a/)
>>>> g10 := (v10^a)
>>>>
>>>> You may assign or change the value of a global variable as follows:
>>>>
>>>> Gn:=(format) (for example: G5:=(v10)).
>>>>
>>>> Note that the parentheses around format are required.
>>>> Global variables can be extracted for output like V variables, just by
>>>> replacing the V with G, which means that the data will be extracted from
>>>> the virtual record. Repeatable groups are supported as well.
>>>>
>>>> Please note that this is a first attempt to implement Global variables,
>>>> and that specific functions could also be implemented to manipulate them
>>>> further. Please let me know if it is worth continuing to work in this
>>>> direction.
>>>>
>>>> IV. New Paging feature into DB Browser and Terms Dictionary
>>>> Databases can be huge. If a database has millions of records and all
>>>> records are loaded into memory, it will consume a huge amount of memory and
>>>> will of course be very slow. As a matter of fact, the user will probably
>>>> only look at 10 or maybe 20 records at a time, depending on the viewport
>>>> size, so there is no need to download all the records locally. That's the
>>>> reason why the paging feature was introduced into the DB Browser and Terms
>>>> Dictionary Browser modules.
>>>> To make the paging feature easy to use, a page navigation toolbar
>>>> provides the interface for navigation.
>>>> 10 000 records are loaded per page, and the user can scroll quickly and
>>>> easily through the records of a page. For example, the VIAF database has
>>>> more than 31 million records (31 305 939 records exactly).
>>>>
>>>>
>>>> V. Export features to select search results and using a hit
>>>> file to drive output are now implemented
>>>>
>>>> You can now export the records retrieved by a search, as well as export
>>>> records in the order defined by a hit file produced by the PrintSort
>>>> module.
>>>>
>>>> Note: A hit file manager will be developed in the future to better
>>>> manage search hit files and hit sort files.
>>>>
>>>>
>>>> VI. The Number of Terms in the index is now stored in an
>>>> external file to avoid the time consuming task of counting them.
>>>>
>>>> The /indexes directory contains a subdirectory called master that
>>>> contains the main index files generated by the Lucene open-source search
>>>> software <http://lucene.apache.org/>. A new file named
>>>> “termscount.properties” is now generated by J-ISIS to keep the number of
>>>> terms in the index as well as a time stamp, and is stored in the
>>>> /indexes/master folder. The number of terms in the index is recomputed
>>>> only when the index has changed, and is then written to the external file
>>>> together with the new time stamp.
>>>>
>>>>
>>>> For databases with more than 2 million records, this considerably
>>>> reduces the time spent getting the database information.
>>>>
>>>>
>>>> --
>>>> Jean-Claude Dauphin
>>>>
>>>> jc.dauphin at gmail.com<mailto:jc.dauphin at gmail.com>
>>>>
>>>> https://github.com/J-ISIS<http://kenai.com/projects/j-isis/>
>>>>
>>>> http://www.unesco.org/isis/
>>>> http://www.unesco.org/idams/
>>>> http://www.greenstone.org
>>>> _______________________________________________
>>>> isis-users mailing list
>>>> isis-users at iccisis.org
>>>> To manage your own subscription options go to:
>>>> http://lists.iccisis.org/listinfo/isis-users
>>>> Or contact Henk Rutten: hlrutten at xs4all.nl
>>>>
>>>
>>>
>>
>>
>> --
>> Jean-Claude Dauphin
>>
>> jc.dauphin at gmail.com
>>
>> https://github.com/J-ISIS <http://kenai.com/projects/j-isis/>
>>
>> http://www.unesco.org/isis/
>> http://www.unesco.org/idams/
>> http://www.greenstone.org
>>
>
>
>
>
--
Wenke Adam
Asesora Sistemas de Doc & Inf
Santiago
Chile
Cel: +56-9-890 21 630