xslt, citeproc-writer

Bruce D'Arcus
bib project questions ...

Re: David Wilson's idea that citeproc give pre-rendered citation and
bibliography chunks (first/subsequent, etc.) and save it in the XML,
described here:

<http://wiki.services.openoffice.org/wiki/Citeproc_Writer_Interaction>

I've thought about this some, and agree with the first part, but not
that the rendered content should be saved in the XML. I think perhaps
we can modify the bibliographic class to store that pre-formatted
content (or create a new ReferenceList class?), so that it's just
stored in memory, rather than saved?

I've updated the wiki to reflect this.

So the process is something like:

Citation passes list of ids to ReferenceList
ReferenceList requests formatted citation and bib chunks from citeproc
Citation requests formatted citation from ReferenceList
Bibliography = ReferenceList to ODF
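
A rough sketch of that flow, in Python purely for illustration (citeproc itself is XSLT, and OOo would do this in its own code). Only the name ReferenceList comes from the message above; the method names, the RenderedChunks type and the fake_citeproc stand-in are hypothetical. The point is that the formatted chunks live in an in-memory cache and are never written back to the source XML.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RenderedChunks:
    """Pre-rendered strings for one source, kept in memory only."""
    first: str        # form used the first time the source is cited
    subsequent: str   # form used for later citations
    bib_entry: str    # bibliography entry


# Stand-in for a call into citeproc (hypothetical signature).
Formatter = Callable[[str], RenderedChunks]


@dataclass
class ReferenceList:
    formatter: Formatter
    _cache: Dict[str, RenderedChunks] = field(default_factory=dict)

    def register(self, ids: List[str]) -> None:
        """A Citation passes its list of ids; each id is formatted once."""
        for ref_id in ids:
            if ref_id not in self._cache:
                self._cache[ref_id] = self.formatter(ref_id)

    def citation(self, ref_id: str, first: bool = True) -> str:
        """A Citation requests its formatted text from the cache."""
        chunks = self._cache[ref_id]
        return chunks.first if first else chunks.subsequent

    def bibliography(self) -> List[str]:
        """Sorted bibliography entries, ready to be written out to ODF."""
        return sorted(c.bib_entry for c in self._cache.values())


def fake_citeproc(ref_id: str) -> RenderedChunks:
    # Placeholder output; real formatting would come from citeproc.
    return RenderedChunks(
        first=f"({ref_id}, full form)",
        subsequent=f"({ref_id})",
        bib_entry=f"{ref_id}. Full bibliographic entry.",
    )


refs = ReferenceList(fake_citeproc)
refs.register(["Veer1996a", "Doe2001"])
print(refs.citation("Veer1996a"))
print(refs.bibliography())
```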

I think this is how MS is doing it in Word 2007.

Right now citeproc is XSLT 2.0. It'd be nice if we could just use it
more-or-less as is. Svante has suggested it's likely OOo might switch
to using Saxon (and thus get XSLT 2.0 for free) in the next major
release.

How feasible would it be to do this? Could we implement essentially
real-time citation processing using XSLT?

It's hard enough to get good C++ programmers, and I'd rather not have
them waste time reimplementing citeproc in that language when it
already works quite well.

Bruce

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: xslt, citeproc-writer

ptsefton
I disagree, Bruce, about not storing rendered content in the XML. I think it
needs to be stored in a rendered form.

If you don't, it will be very hard to write things like OpenDocument-to-HTML
transforms in arbitrary languages, as you would need to run citeproc to
format citations and bibliographies.

Related to this is an interoperability question. It is important not to
focus only on interop with OOo2; after all, with a free software package it
is easy to get users to upgrade. Consider the interop problems with MS Word.

Have you considered an approach where citations are stored as rendered text
(or footnotes/endnotes) in place, with a link of some kind back to the
bibliographic database, and with the citation details stored as an item in
the database? That is, you would have (1) a Work, with (2) a particular
expression (is that what you call it?), with (3) a citation by page or line
number or whatever. That is three items in the database, and only one simple
link in the document text. Seems to me that this would fit very well with
your RDF approach, Bruce. And an approach like this might mean that you could
build a solution that works with OpenXML docs as well.





--
Peter (pt) Sefton
Toowoomba 4350
Queensland, Australia
Phone: +61 4 1032 6955
Web: http://ptsefton.com
Email: [hidden email]

Re: xslt, citeproc-writer

Bruce D'Arcus

Hey Peter,

On Jun 27, 2006, at 11:09 PM, pt wrote:

> I disagree Bruce, about not storing rendered content in the XML. I
> think it needs to be stored in a rendered form.
>
> If you don't then it will make it very hard to write things like
> OpenDocument to HTML transforms in random languages as you would need
> to run citeproc to format citations and bibliographies.

Maybe I wasn't clear, but the citation always gets included in the
content file; there's no other way to display it after all. And that
can be easily transformed to HTML or OXML.

What David was suggesting (if I understood right) was that the
bibliographic source file (bibliography.xml or whatever) would, beyond
the raw metadata, also include pre-rendered chunks for all potential
citation rendering options for a given style, plus the bibliographic
entry.

My problem with that is it results in redundant and unnecessary
content, and pollutes the source file.

> Related to this is an interoperability question. It is important not to
> focus only on interop with OOo2, after all with a free software
> package it is easy to get users to upgrade. Consider the interop
> problems with MS Word.

Given what I say above, do you still see any interop problems?

> Have you considered an approach where citations are stored as rendered
> text (or footnote/endnotes) in place with a link of some kind back to
> the bibliographic database with the citation details stored as an item
> in the database.

That's exactly what the new ODF approach (and the MS OXML approach)
does ;-)

> That is you would have a (1) Work, with (2) particular expression
> (is that what you call it) with (3) a citation by page or line number
> or whatever. That is three items in the database - and only one simple
> link in the document text. Seems to me that this would fit very well
> with your RDF approach, Bruce. And an approach like this might mean
> that you could build a solution that could work with OpenXML docs as
> well.

Am not quite following this bit. The plan is:

1)  content.xml holds the new citation fields, which are:
        a) link to a source record, and
        b) rendered citation

2)  the source metadata gets stored in a dedicated file within the
wrapper; maybe bibliography/source.xml

1b gets generated from 2. This is exactly how MS is doing it,
coincidentally, in OXML.
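
To make the plan concrete, here is a minimal Python sketch of a citation field being built from source metadata. The element and attribute names biblioref, key, detail and citation-body are the ones used in this thread and on the wiki; everything else is an assumption made for illustration: the namespace URI, the nesting of citation-body inside biblioref, the shape of the source record, and the toy render function that stands in for citeproc.

```python
import xml.etree.ElementTree as ET

CITE_NS = "urn:example:cite"  # hypothetical namespace URI, for illustration
ET.register_namespace("cite", CITE_NS)


def q(name):
    """Qualify a local name with the (assumed) cite namespace."""
    return f"{{{CITE_NS}}}{name}"


# (2) Source metadata; stands in for a file like bibliography/source.xml.
# The record fields are placeholders, not a real schema.
SOURCES = {"Veer1996a": {"label": "Veer 1996a"}}


def render(record, begin, end):
    """Toy formatter; real output would come from citeproc and a style."""
    return f"({record['label']} pp. {begin}-{end})"


def citation_field(key, begin, end):
    """Build (1): a link to the source record plus the rendered citation."""
    ref = ET.Element(q("biblioref"), {q("key"): key})          # 1a: link
    ET.SubElement(
        ref,
        q("detail"),
        {q("units"): "pages", q("begin"): begin, q("end"): end},
    )
    body = ET.SubElement(ref, q("citation-body"))              # 1b: rendered
    body.text = render(SOURCES[key], begin, end)               # 1b from 2
    return ref


field = citation_field("Veer1996a", "23", "24")
print(ET.tostring(field, encoding="unicode"))
```

Because the rendered text sits in the content file, a transform to HTML or OXML only has to copy it; it never needs to run the formatter.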

What David was thinking about was funky citation styles (well, many of
them, in fact; APA, Chicago, etc.) that distinguish first and
subsequent citations. The way citeproc works now is, IT has to figure
out this sort of positional information, and then inserts the right
formatted version in the output.

The alternative, then, is to just have citeproc be rather dumb about
it, and create the two representations for each citation, and have the
new citation support figure out which to use.
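
A minimal sketch of that alternative, in Python for illustration only. It assumes the formatter hands back a (first, subsequent) pair of strings for each source and that the application knows the order in which keys are cited; the function name and the placeholder strings are hypothetical, and real styles have more positions than these two.

```python
from typing import Dict, List, Tuple

# Two pre-rendered forms per source, as a "dumb" citeproc might supply them.
# The strings are placeholders, not real style output.
RENDERED: Dict[str, Tuple[str, str]] = {
    "Veer1996a": ("Veer1996a [first form]", "Veer1996a [subsequent form]"),
    "Doe2001": ("Doe2001 [first form]", "Doe2001 [subsequent form]"),
}


def pick_forms(cited_keys: List[str],
               rendered: Dict[str, Tuple[str, str]]) -> List[str]:
    """Walk the citations in document order and choose a form for each."""
    seen = set()
    chosen = []
    for key in cited_keys:
        first, subsequent = rendered[key]
        chosen.append(subsequent if key in seen else first)
        seen.add(key)
    return chosen


document_order = ["Veer1996a", "Doe2001", "Veer1996a"]
for text in pick_forms(document_order, RENDERED):
    print(text)
```

The positional logic then lives in the word processor, which already knows the document order, rather than in the formatter.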

Bruce


Re: xslt, citeproc-writer

ptsefton
Thanks for the prompt response, Bruce.

What I'm suggesting for consideration is that you take this example from
http://wiki.services.openoffice.org/wiki/Bibliographic_Project%27s_Developer_Page#New_Citation_Coding

<cite:biblioref cite:key="Veer1996a">
   <cite:detail cite:units="pages" cite:begin="23" cite:end="24"/>
</cite:biblioref>

And change it to something like this:

<cite:biblioref cite:key="Veer1996a-citation1">
</cite:biblioref>

Where the page-range details are stored in the bibliographic database
as a new item that relates to the key 'Veer1996a', so each page range would
have its own record in the db.

(I'm assuming the rendered text will go inside the cite element?)

I'm still of the opinion that you would be better off building a
bibliographic database with the simplest possible hooks into the file
format; this could be implemented using a cross reference field or some
other field that exists already in OpenDocument.

What if the OOo bibliographic tool had its own web server? Then the inline
text would look like:

<a href="http://localhost:1234/cite-key/Veer1996a/1/">(Veer 1996a pp.
23-24)</a>

Where the database would have a record for 'Veer1996a' and a record for each
page-range cited.
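
A small Python sketch of how such a link might be resolved, under assumptions of my own: the two in-memory tables stand in for the bibliographic database, their field names are invented, and the last path segment is read as the number of the citation record for that work. Only the URL shape comes from the example above.

```python
from urllib.parse import urlsplit

# Stand-ins for the bibliographic database (hypothetical record layout).
WORKS = {"Veer1996a": {"label": "Veer 1996a"}}
CITATIONS = {
    ("Veer1996a", 1): {"units": "pages", "begin": "23", "end": "24"},
}


def resolve(href):
    """Map a link like .../cite-key/<key>/<n>/ to its work and citation."""
    parts = urlsplit(href).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "cite-key":
        raise ValueError(f"not a citation link: {href}")
    key, number = parts[1], int(parts[2])
    return WORKS[key], CITATIONS[(key, number)]


work, citation = resolve("http://localhost:1234/cite-key/Veer1996a/1/")
print(work["label"], citation["begin"], citation["end"])
```

Note that each distinct page range needs its own entry in the citation table, which is the cost of keeping the in-document link this simple.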

Citeproc could change the rendered citation inside the <a> element
dynamically as has been proposed elsewhere.

I am suggesting these ideas because I think they would allow an integrated
tool in OOo to also interoperate with Word documents, HTML documents and
so on, in the kinds of real environments we see at our university, where we
have Windows, Mac and Linux users running different versions of Word, OOo
and LaTeX. To work with .doc files one would have to ship the database
separately, as you do with EndNote now, but it could be embedded in the
file where OpenDocument 3 support is available.




--
Peter (pt) Sefton
Toowoomba 4350
Queensland, Australia
Phone: +61 4 1032 6955
Web: http://ptsefton.com
Email: [hidden email]

Re: xslt, citeproc-writer

Bruce D'Arcus

On Jun 28, 2006, at 12:24 AM, pt wrote:

> What I'm suggesting for consideration is that you take this example from
> http://wiki.services.openoffice.org/wiki/Bibliographic_Project%27s_Developer_Page#New_Citation_Coding
>
> <cite:biblioref cite:key="Veer1996a">
>     <cite:detail cite:units="pages" cite:begin="23" cite:end="24"/>
>   </cite:biblioref>
>
> And change it to something like this:
>
> <cite:biblioref cite:key="Veer1996a-citation1">
>
>   </cite:biblioref>
>
> Where the page-range details are stored in the bibliographic database
> as a new item that relates to the key 'Veer1996a' - so each page-range
> would have its own record in the db.

OK, I see. But I'm not clear on what that buys us.

> (I'm assuming the rendered text will go inside the cite element?)

The cite:citation-body element.

> I'm still of the opinion that you would be better off building an
> bibliographic database with the simplest possible hooks into the file
> format; this could be implemented using a cross reference field or some
> other field that exists already in OpenDocument.

We'll probably revisit the cite:key attribute to ensure we have a
generic attribute for this linking. That attribute content will most
likely be a uri, rather than the dumb string in the example on the
wiki.

> What if the OOo bibliographic tool had its own web server? Then the
> inline text would look like:
>
> <a href="http://localhost:1234/cite-key/Veer1996a/1/">(Veer 1996a pp.
> 23-24)</a>
>
> Where the database would have a record for 'Veer1996a' and a record
> for each page-range cited.

Intuitively I don't think that's very practical. Say I have 50
citations to a single source in a book; does it really make sense to
have 50 database entries? For what benefit? That, in fact, goes against
the grain of all existing citation systems I am aware of, so it would
seem to introduce more interoperability headaches.

> Citeproc could change the rendered citation inside the <a> element
> dynamically as has been proposed elsewhere.
>
> I am suggesting these ideas because I think they would allow an
> integrated tool in OOo to also interoperate with Word documents, HTML
> documents and so on in the kinds of real environments we see at our
> university where we have Windows, Mac and Linux users running
> different versions of Word and OOo and LaTeX.

Each format is going to have its own approach to encoding the
citations, but that's a trivial detail as I see it. If we say a
citation should be a uri, then you could encode a full citation like
this if you wanted:

        urn:isbn:2354-1235#page=23?suppress=author

But given ODF is a fairly nicely-designed XML format, we may as well  
have it encoded as clean XML. Besides, the new citation support is  
already approved (has been for about 18 months).
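
For what it's worth, a string like the one above can be pulled apart with a few lines of code. The Python sketch below is only an illustration and assumes the layout shown: a source identifier, then "#" and a locator, then "?" and rendering options. In generic URI syntax the query comes before the fragment, so a standard URI parser would treat everything after "#" as a single fragment; the sketch therefore splits by hand. The function name and the keys of the returned dictionary are hypothetical.

```python
from urllib.parse import parse_qsl


def parse_citation_uri(uri):
    """Split '<source>#<locator>?<options>' into its three parts."""
    source, _, rest = uri.partition("#")
    locator, _, options = rest.partition("?")
    return {
        "source": source,
        "locator": dict(parse_qsl(locator)),   # e.g. page=23
        "options": dict(parse_qsl(options)),   # e.g. suppress=author
    }


parsed = parse_citation_uri("urn:isbn:2354-1235#page=23?suppress=author")
print(parsed)
```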

Or maybe I'm just missing something (feel free to correct); about time  
for bed!

Bruce
