[Isis-users] Old CDS-ISIS archives

Renate Morgenstern rmorgenstern at iway.na
Tue Apr 12 13:58:57 CEST 2011


Good day,

Should there be a problem getting it into an ISIS database, one could consider putting it in the Greenstone Digital Library software. It has an email plugin designed for mail archives. The archive would be fully searchable, and can be configured with different types of classifiers, e.g. Subject, Sender, etc.
Just a suggestion.
Regards
Renate


----- Original Message -----
From: De Smet Egbert <egbert.desmet at ua.ac.be>
Date: Monday, April 11, 2011 9:23 pm
Subject: Re: [Isis-users] Old CDS-ISIS archives
To: Luciano Ramalho <luciano.ramalho at bireme.org>, "isis-users at iccisis.org" <isis-users at iccisis.org>


> Luciano,
> 
> I have in fact already done so. I ran a Perl script over the texts, 
> which writes the typical mail headers (e.g. Reply-To: etc.) out as 
> 'tagged text' with the values of the fields. Then I ran a text-to-ISIS 
> converter over it to convert it to ISO2709. It worked rather well on a 
> sample set of the ISIS archives, after I found a way to keep the 
> quoted e-mails in the body of the message (which carry the same tags as 
> the real ones) from being treated as new e-mails. This was months ago 
> however, so I will now (if I find some time) refresh my memory about 
> it and send you some sample data.
> The next step is to get the whole set of files over here (there are 
> many, one per month for each year since at least 1994) and to 
> run the conversion on them.
> If the structure of the messages (I suppose that is what you mean by 
> RFC-2822) is sufficiently constant, we could go for the 
> database-approach (i.e. ABCD), if not we could go for your approach 
> keeping it at a flat-text level with full-text indexing like Google Desktop's.
> The advantage of the database solution is of course that one could 
> search on 'From' fields (to compile statistics on who the most active 
> senders are ;-) ) and run 'Subject' searches on top of the full text of 
> the messages themselves. I remember that this worked in my 
> first tests.
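
A minimal sketch of the tagging step described above, in Python rather than the Perl script mentioned, which is not shown in this thread. The numeric tags and the "tag value" output format are hypothetical and stand in for whatever a text-to-ISIS converter actually expects. The standard email parser ends the header block at the first blank line, so headers repeated inside quoted text in the body are not mistaken for a new message.

```python
from email import message_from_string

# Hypothetical mapping from header names to numeric ISIS field tags.
FIELD_TAGS = {"From": 10, "Date": 20, "Subject": 30, "Reply-To": 40}
BODY_TAG = 90  # hypothetical tag for the message body


def to_tagged_text(raw_message: str) -> list[str]:
    """Return one 'tag value' line per mapped header, plus the body."""
    msg = message_from_string(raw_message)
    lines = []
    for header, tag in FIELD_TAGS.items():
        # get_all returns every occurrence of a header, or [] if absent.
        for value in msg.get_all(header, []):
            lines.append(f"{tag} {value}")
    body = msg.get_payload()
    if isinstance(body, str):
        # Collapse the body to one line so it fits a single field.
        lines.append(f"{BODY_TAG} {' '.join(body.split())}")
    return lines


sample = (
    "From: a@example.org\n"
    "Subject: Old archives\n"
    "\n"
    "> From: b@example.org\n"
    "> Subject: quoted\n"
    "Reply text.\n"
)
tagged = to_tagged_text(sample)
```

Here the quoted "From:" and "Subject:" lines end up inside the body field instead of starting a second record.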
> 
> Egbert de Smet
> Univ. of Antwerp
> 
> ________________________________________
> From: isis-users-bounces at iccisis.org [isis-users-bounces at iccisis.org] 
> on behalf of Luciano Ramalho [luciano.ramalho at bireme.org]
> Sent: Monday, April 11, 2011 10:06 PM
> To: isis-users at iccisis.org
> Subject: Re: [Isis-users] Old CDS-ISIS archives
> 
> In order to convert the mailing list archives to ISIS we need to map
> the e-mail headers, as defined by RFC-2822, to ISIS fields. Also,
> multipart messages, such as those containing HTML or attachments, need
> to have their bodies mapped to several fields.
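
One possible way to map the parts of a multipart message to several fields, sketched with Python's standard email package. The field tags are hypothetical, and the choice to store only the file name of an attachment is an assumption, not something proposed in this thread.

```python
from email import message_from_string
from email.message import EmailMessage

# Hypothetical ISIS field tags per MIME content type.
PART_TAGS = {"text/plain": 90, "text/html": 91}
ATTACHMENT_TAG = 95  # hypothetical: stores only the attachment file name


def map_parts(raw_message: str) -> list[tuple[int, str]]:
    """Return (tag, value) pairs, one per leaf part of the message."""
    msg = message_from_string(raw_message)
    fields = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        filename = part.get_filename()
        if filename:
            fields.append((ATTACHMENT_TAG, filename))
            continue
        tag = PART_TAGS.get(part.get_content_type())
        if tag is not None:
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            text = payload.decode(charset, errors="replace").strip()
            fields.append((tag, text))
    return fields


# Build a small multipart message to exercise the mapping.
demo = EmailMessage()
demo["Subject"] = "Test"
demo.set_content("Plain body")
demo.add_alternative("<p>HTML body</p>", subtype="html")
demo.add_attachment(b"BEGIN:VCARD", maintype="application",
                    subtype="octet-stream", filename="card.vcf")
fields = map_parts(demo.as_string())
```

Repeated tags (for example two attachments) would naturally become repeated occurrences of the same field.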
> 
> I am new to ISIS, so I don't know about previous experiences
> converting mailing list archives to ISIS. Does anyone know of a
> pre-existing mapping between RFC-2822 messages and ISIS records, or
> should we develop one especially for this task?
> 
> Best regards,
> 
> Luciano
> 
> 
> 
> 2011/4/9 Luciano Ramalho <luciano.ramalho at bireme.org>:
> > 2011/4/8 Henk Rutten <hlrutten at xs4all.nl>:
> >> It’s more a matter of lack of time. It should be possible to convert the old
> >> archives, not to Mailman, but for instance to an ABCD database. The only
> >> thing is, that we didn’t have time to do it yet. I’m very sorry!
> >
> > Hello, Henk,
> >
> > I am willing to help preserve and re-publish that archive.
> >
> > My proposal would be to convert each e-mail to an HTML page, with
> > links to previous and next message by date and tables of contents by
> > month. Then it would be just a matter of uploading the HTML to a
> > public site and integrating a Google search box.
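
A rough sketch of the chronological page generation proposed above, assuming each message has already been reduced to a dict with 'date', 'subject' and 'body' keys. The file naming scheme and the page layout are hypothetical, and the monthly tables of contents and the search box are left out.

```python
from email.utils import parsedate_to_datetime
from html import escape


def build_pages(messages: list[dict]) -> dict[str, str]:
    """Return {file name: HTML}, with previous/next links in date order."""
    ordered = sorted(messages, key=lambda m: parsedate_to_datetime(m["date"]))
    names = [f"msg{i:05d}.html" for i in range(len(ordered))]
    pages = {}
    for i, msg in enumerate(ordered):
        nav = []
        if i > 0:
            nav.append(f'<a href="{names[i - 1]}">previous</a>')
        if i < len(ordered) - 1:
            nav.append(f'<a href="{names[i + 1]}">next</a>')
        # Escape user-supplied text so message content cannot break the markup.
        pages[names[i]] = (
            f"<h1>{escape(msg['subject'])}</h1>\n"
            f"<p>{' | '.join(nav)}</p>\n"
            f"<pre>{escape(msg['body'])}</pre>\n"
        )
    return pages


pages = build_pages([
    {"date": "Sat, 09 Apr 2011 10:00:00 +0200", "subject": "Second", "body": "b"},
    {"date": "Fri, 08 Apr 2011 07:39:00 +0200", "subject": "First", "body": "a"},
])
```

The returned dict can then be written to disk one file per key and uploaded as static HTML.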
> >
> > I'd also investigate organizing the messages by thread, but since that
> > is more complicated I'd initially focus on publishing everything
> > chronologically, and then, after that is online and searchable,
> > evaluate a way to organize by threads as well.
> >
> > Henk, if you can give me access to the archive files I'd immediately
> > put them on a public Web site for anyone to download in bulk, and
> > start designing the conversion process, keeping the present mailing
> > list informed of the progress. The resulting tools, datasets and files
> > would be shared with all, so that anyone may reuse or republish them.
> >
> > Cheers,
> >
> > Luciano Ramalho
> >
> > PS. This is a personal project that I'd do in my spare time, as a
> > programmer and librarian interested in helping preserve this resource
> > for the history of computers in libraries.
> >
> > 2011/4/8 Henk Rutten <hlrutten at xs4all.nl>:
> >> Dear Renate,
> >>
> >> It’s more a matter of lack of time. It should be possible to convert the old
> >> archives, not to Mailman, but for instance to an ABCD database. The only
> >> thing is, that we didn’t have time to do it yet. I’m very sorry!
> >>
> >> Have a nice day.
> >>
> >> Henk Rutten
> >>
> >> From: Renate Morgenstern [mailto:rmorgenstern at iway.na]
> >> Sent: Friday, April 08, 2011 7:39 AM
> >> To: hlrutten at xs4all.nl
> >> Subject: Old CDS-ISIS archives
> >>
> >> Good day,
> >> I am/was subscribed to the CDS-ISIS list for many years. I sometimes used
> >> the archives when I was looking for help to solve a problem. Would it be possible
> >> to get the old archives converted to the Mailman archives?
> >> Thanks and regards
> >> Renate
> >>
> >>
> >> --
> >>
> >> Renate Morgenstern
> >>
> >> P O Box 30664, Windhoek, Namibia
> >>
> >> Tel/Fax: 242124
> >>
> >> Fax to Email: 088637518
> >>
> >> Email: rmorgenstern at iway.na
> >>
> >> _______________________________________________
> >> isis-users mailing list
> >> isis-users at iccisis.org
> >> To manage your own subscription options go to:
> >> http://lists.iccisis.org/listinfo/isis-users
> >> Or contact Henk Rutten: hlrutten at xs4all.nl
> >>
> >>
> >
> >
> >
> > --
> > Luciano Ramalho
> > supervisor de desenvolvimento || software development lead
> > BIREME/OPAS/OMS || BIREME/PAHO/WHO
> >
> 
> 
> 
> --
> Luciano Ramalho
> supervisor de desenvolvimento || software development lead
> BIREME/OPAS/OMS || BIREME/PAHO/WHO