Update manubot for more flexible citation processing
merges manubot/rootstock#342
Includes major updates to how citations are processed by the pandoc-manubot-cite filter.
See the following commit messages for more information:
- manubot/manubot@7055bcc
- manubot/manubot@47b03e0
Removes requirement to prefix certain citekeys with raw: or tag:
Removes support for deprecated `content/citation-tags.tsv`.
Switches from tag to alias terminology for citation aliases.
closes manubot/manubot#120
Removes pandas and jsonref dependencies, which are no longer needed.
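
For manuscripts that still use the old citekey spellings, the documentation changes in this PR suggest a mostly mechanical rewrite. The snippet below is a minimal sketch of that rewrite, not part of manubot: `migrate_citekeys` and its mapping tables are hypothetical names, and the rules are inferred only from the before/after lines in this diff (`pmid` to `pubmed`, `pmcid` to `pmc`, and dropping `url:`, `raw:`, and `tag:`). This PR does not state whether the old spellings keep working, so treat it as an optional cleanup.

```python
import re

# Prefix renames that mirror the documentation changes in this diff.
# Assumption: illustrative table, not taken from manubot's source code.
RENAMED_PREFIXES = {"pmid": "pubmed", "pmcid": "pmc"}

# Prefixes the updated docs no longer show: raw:, tag:, and url: (the
# latter only when it is followed by an http(s) URL).
OLD_PREFIX = re.compile(r"(?<!\w)@(pmid|pmcid|raw|tag|url(?=:https?://)):")


def migrate_citekeys(text: str) -> str:
    """Rewrite old-style citation keys in Markdown text to the new style."""

    def rewrite(match: re.Match) -> str:
        prefix = match.group(1)
        if prefix in RENAMED_PREFIXES:
            return f"@{RENAMED_PREFIXES[prefix]}:"
        # raw:, tag:, and url: are dropped, leaving the bare key or URL.
        return "@"

    return OLD_PREFIX.sub(rewrite, text)


if __name__ == "__main__":
    old = "[@doi:10.7554/eLife.32822; @pmid:30718888; @tag:deep-review]"
    print(migrate_citekeys(old))
```

Keys with other prefixes, such as `@doi:` or `@https://`, pass through unchanged. Aliases defined with a prefix of their own would still need a manual check.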
  Manubot supports [Pandoc citations](https://pandoc.org/MANUAL.html#citations), but with added support for citing persistent identifiers directly.
  Citations are processed in 3 stages:

  1. Pandoc parses the input Markdown to locate citation keys.
- 2. The [`pandoc-manubot-cite`](https://github.com/manubot/manubot#pandoc-filter) filter automatically retreives the bibliographic metadata for citation keys.
- 3. The [`pandoc-citeproc`](https://github.com/jgm/pandoc-citeproc/blob/master/man/pandoc-citeproc.1.md) filter renders in-text citations and generates styled references.
+ 2. The [`pandoc-manubot-cite` filter](https://github.com/manubot/manubot#pandoc-filter) automatically retrieves the bibliographic metadata for citation keys.
+ 3. The [`pandoc-citeproc` filter](https://github.com/jgm/pandoc-citeproc/blob/master/man/pandoc-citeproc.1.md) renders in-text citations and generates styled references.

- When using Manubot, citation keys should be formatted like `@source:identifier`,
- where `source` is one of the options described below.
+ When citing persistent identifiers, citation keys should be formatted like `@prefix:accession`,
+ where `prefix` is one of the options described below.
  When choosing which source to use for a citation, we recommend the following order:

  1. DOI (Digital Object Identifier), cite like `@doi:10.15363/thinklab.4`.
  Shortened versions of DOIs can be created at [shortdoi.org](http://shortdoi.org/).
  shortDOIs begin with `10/` rather than `10.` and can also be cited.
  For example, Manubot will expand `@doi:10/993` to the DOI above.
  We suggest using shortDOIs to cite DOIs containing forbidden characters, such as `(` or `)`.
- 2. PubMed Central ID, cite like `@pmcid:PMC4497619`.
- 3. PubMed ID, cite like `@pmid:26158728`.
+ 2. PubMed Central ID, cite like `@pmc:PMC4497619`.
+ 3. PubMed ID, cite like `@pubmed:26158728`.
  4. _arXiv_ ID, cite like `@arxiv:1508.06576v2`.
  5. ISBN (International Standard Book Number), cite like `@isbn:9781339919881`.
- 6. URL / webpage, cite like `@url:https://nyti.ms/1QUgAt1`.
+ 6. URL / webpage, cite like `@https://nyti.ms/1QUgAt1`.
  URL citations can be helpful if the above methods return incorrect metadata.
- For example, `@doi:10.1038/ng.3834` [incorrectly handles](https://github.com/manubot/manubot/issues/158) the consortium name resulting in a blank author, while `@url:https://doi.org/10.1038/ng.3834` succeeds.
- Similarly, `@url:https://doi.org/10.1101/142760` is a [workaround](https://github.com/manubot/manubot/issues/16) to set the journal name of bioRxiv preprints to _bioRxiv_.
+ For example, `@doi:10.1038/ng.3834` [incorrectly handles](https://github.com/manubot/manubot/issues/158) the consortium name resulting in a blank author, while `@https://doi.org/10.1038/ng.3834` succeeds.
+ Similarly, `@https://doi.org/10.1101/142760` is a [workaround](https://github.com/manubot/manubot/issues/16) to set the journal name of bioRxiv preprints to _bioRxiv_.
  7. Wikidata Items, cite like `@wikidata:Q50051684`.
- Note that anyone can edit or add records on [Wikidata](https://www.wikidata.org), so users are encouraged to contribute metadata for hard-to-cite works to Wikidata as an alternative to using a `raw` citation.
- 8. For references that do not have any of the persistent identifiers above, use a raw citation like `@raw:old-manuscript`.
- Metadata for raw citations must be provided manually.
+ Note that anyone can edit or add records on [Wikidata](https://www.wikidata.org), so users are encouraged to contribute metadata for hard-to-cite works to Wikidata.
+ 8. Any other compact identifier supported by <https://identifiers.org>.
+ Manubot uses the Identifiers.org Resolution Service to support [hundreds](https://github.com/manubot/manubot/blob/7055bcc6524fdf1ef97d838cf0158973e2061595/manubot/cite/handlers.py#L122-L831 "Actual prefix support is determined by this manubot source code.") of [prefixes](https://registry.identifiers.org/registry "Identifiers.org prefix search").
+ For example, citing `@clinicaltrials:NCT04280705` will produce the same bibliographic metadata as `@https://identifiers.org/clinicaltrials:NCT04280705` or `@https://clinicaltrials.gov/ct2/show/NCT04280705`.
+ 9. For references that do not have any of the above persistent identifiers, the citation key does not need to include a prefix.
+ Citing `@old-manuscript` will work, but only if reference metadata is [provided manually](#reference-metadata).

  Cite multiple items at once like:

  ```md
- Here is a sentence with several citations [@doi:10.15363/thinklab.4; @pmid:26158728; @arxiv:1508.06576; @isbn:9780394603988].
+ Here is a sentence with several citations [@doi:10.15363/thinklab.4; @pubmed:26158728; @arxiv:1508.06576; @isbn:9780394603988].
  ```

  Note that multiple citations must be semicolon separated.
  Be careful not to cite the same study using identifiers from multiple sources.
- For example, the following citations all refer to the same study, but will be treated as separate references: `[@doi:10.7717/peerj.705; @pmcid:PMC4304851; @pmid:25648772]`.
+ For example, the following citations all refer to the same study, but will be treated as separate references: `[@doi:10.7717/peerj.705; @pmc:PMC4304851; @pubmed:25648772]`.

  Citation keys must adhere to the syntax described in the [Pandoc manual](https://pandoc.org/MANUAL.html#citations):

  > The citation key must begin with a letter, digit, or `_`, and may contain alphanumerics, `_`, and internal punctuation characters (`:.#$%&-+?<>~/`).

  To evaluate whether a citation key fully matches this syntax, try [this online regex](https://regex101.com/r/mXZyY2/latest).
- If the citation key is not valid, use the [citation tag](#citation-tag) workaround below.
+ If the citation key is not valid, use the [citation aliases](#citation-aliases) workaround below.
  This is required for citation keys that contain forbidden characters such as `;` or `=` or end with a non-alphanumeric character such as `/`.
  <!-- See [jgm/pandoc#6026](https://github.com/jgm/pandoc/issues/6026) for progress on a more flexible Markdown citation key syntax. -->

@@ -134,66 +137,63 @@ pandoc:
  manubot-fail-on-errors: True
  ```

- #### Citation tags
+ #### Citation aliases

- The system also supports citation tags, which map from one citation key (an alias) to another.
- Tags are recommended for the following applications:
+ The system also supports citation aliases, which map from one citation key (the "alias" or "tag") to another.
+ Aliases are recommended for the following applications:

- 1. A citation's identifier contains forbidden characters, you must use a tag.
+ 1. A citation key contains forbidden characters.
  2. A single reference is cited many times.
- Therefore, it might make sense to define a tag, so if the citation updates (e.g. a newer version becomes available), only a single change is required.
+ Therefore, it might make sense to define an alias, so if the citation updates (e.g. a newer version becomes available), only a single change is required.

- Tags can be defined using Markdown's [link reference syntax](https://spec.commonmark.org/0.29/#link-reference-definitions) as follows:
+ Aliases can be defined using Markdown's [link reference syntax](https://spec.commonmark.org/0.29/#link-reference-definitions) as follows:

  ```markdown
- Citing a URL containing a `?` character [@tag:my-url].
- Citing a DOI containing parentheses [@doi:my-doi].
+ Citing a URL containing a `?` character [@my-url].
For backwards compatibility, tags can also be defined in `content/citation-tags.tsv`.
- If `citation-tags.tsv` defines the tag `study-x`, then this study can be cited like `@tag:study-x`.
- This method is deprecated.
-
  ## Reference metadata

  Manubot stores the bibliographic details for references (the set of all cited works) as CSL JSON ([Citation Style Language Items](http://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html#csl-json-items)).
- For all citation sources besides `raw`, Manubot automatically generates CSL JSON.
+ Manubot automatically generates CSL JSON for most persistent identifiers (as described in [Citations](#citations) above).
  In some cases, automatic metadata retrieval fails or provides incorrect or incomplete information.
- Errors are most common for `url` references.
+ Errors are most common for references generated from scraping HTML metadata from websites.
+ This occurs most frequently for `https`/`http`/`url` citations as well as identifiers.org prefixes without explicit support listed above.
  Therefore, Manubot supports user-provided metadata, which we refer to as "manual references".
  When a manual reference is provided, Manubot uses the supplied metadata and does not attempt to generate it.

  Manubot searches the `content` directory for files that match the glob pattern `manual-references*.*` and expects that these files contain manual references.
  [`content/manual-references.json`](content/manual-references.json) is the default file to specify custom CSL JSON metadata.
  Manual references are matched to citations using their "id" field.
- For example, to manually specify the metadata for the citation `@url:https://github.com/manubot/rootstock`, add a CSL JSON Item to `manual-references.json` that contains the following excerpt:
+ For example, to manually specify the metadata for the citation `@https://github.com/manubot/rootstock`, add a CSL JSON Item to `manual-references.json` that contains the following excerpt:

  ```json
- "id": "url:https://github.com/manubot/rootstock",
+ "id": "https://github.com/manubot/rootstock",
  ```

- The metadata for `raw` citations must be provided in a manual reference file (e.g. `manual-references.json`) or an error will occur.
- For example, to cite `@raw:private-message` in a manuscript, a corresponding CSL JSON Item is required, such as:
+ The metadata for unhandled citations — any citation key that is a not a supported persistent ID — must be provided in a manual reference file (e.g. `manual-references.json`) or an error will occur.
+ For example, to cite `@private-message` in a manuscript, a corresponding CSL JSON Item is required, such as:

  ```json
  {
- "id": "raw:private-message",
+ "id": "private-message",
  "type": "personal_communication",
  "title": "Personal communication with Doctor X"
  }
@@ -204,10 +204,10 @@ For guidance on what CSL JSON should be like for different document types, refer

  Manubot offers some support for other bibliographic metadata formats besides CSL JSON, by delegating conversion to the `pandoc-citeproc --bib2json` [utility](https://github.com/jgm/pandoc-citeproc/blob/master/man/pandoc-citeproc.1.md#convert-mode).
  Formats are inferred from filename extensions.
- So, for example, to provide metadata for `@url:https://github.com/manubot/rootstock` in BibTeX format, create the file `content/manual-references.bib` and create an item whose definition starts with the excerpt:
+ So, for example, to provide metadata for `@https://github.com/manubot/rootstock` in BibTeX format, create the file `content/manual-references.bib` and create an item whose definition starts with the excerpt:

  ```latex
- @misc{url:https://github.com/manubot/rootstock,
+ @misc{https://github.com/manubot/rootstock,
  ```

  Processed reference metadata in CSL JSON format, either generated by Manubot or specified via manual references, is exported to `references.json`.
content/02.delete-me.md (7 additions & 7 deletions)
@@ -95,24 +95,24 @@ Bare URL link: <https://manubot.org>

  Citation by DOI [@doi:10.7554/eLife.32822].

- Citation by PubMed Central ID [@pmcid:PMC6103790].
+ Citation by PubMed Central ID [@pmc:PMC6103790].

- Citation by PubMed ID [@pmid:30718888].
+ Citation by PubMed ID [@pubmed:30718888].

  Citation by Wikidata ID [@wikidata:Q56458321].

  Citation by ISBN [@isbn:9780262517638].

- Citation by URL [@url:https://greenelab.github.io/meta-review/].
+ Citation by URL [@https://greenelab.github.io/meta-review/].

- Citation by tag [@tag:deep-review].
+ Citation by alias [@deep-review].

- Multiple citations can be put inside the same set of brackets [@doi:10.7554/eLife.32822; @tag:deep-review; @isbn:9780262517638].
- Manubot plugins provide easier, more convenient visualization of and navigation between citations [@doi:10.1371/journal.pcbi.1007128; @pmid:30718888; @pmcid:PMC6103790; @tag:deep-review].
+ Multiple citations can be put inside the same set of brackets [@doi:10.7554/eLife.32822; @deep-review; @isbn:9780262517638].
+ Manubot plugins provide easier, more convenient visualization of and navigation between citations [@doi:10.1371/journal.pcbi.1007128; @pubmed:30718888; @pmc:PMC6103790; @deep-review].

  Citation tags (i.e. aliases) can be defined in their own paragraphs using Markdown's reference link syntax: