<LinkOptions>
  <MaximumDepth value="1"/>
  <FollowOffsite value="yes"/>
  <MaximumOffsiteDepth value="1"/>
  <SubDirOnly value="no"/>
  <UnresolvedDetail value="include"/>
  <Exclude>
    <Pattern>RE::table[1-9].jpg</Pattern>
    <Pattern>WC::figures*plant?blue</Pattern>
  </Exclude>
  <Include>
    <Pattern>RE:C:Table8</Pattern>
  </Include>
  <ExternalDocuments>
    <Specification>
      <Path>DocumentB</Path>
      <Prefix>DocumentB</Prefix>
      <KeepPrefix value="no"/>
      <MapFile>c:\Docs\DocumentB\DocumentB.map</MapFile>
      <Lookup value="ByID"/>
    </Specification>
  </ExternalDocuments>
</LinkOptions>
|Element||Type||Default||Description|
|MaximumDepth||value||1||Maximum link depth to follow|
|FollowOffsite||value||yes||Whether to follow offsite links|
|MaximumOffsiteDepth||value||1||Maximum link depth for following offsite links. This option requires version 4.25 or later of iSiloX and iSiloXC.|
|SubDirOnly||value||no||Whether to limit followed links to subdirectories. This option requires version 3.15 or later of iSiloX and iSiloXC.|
|UnresolvedDetail||value||include||Whether to include unresolved URLs|
|Exclude||multi-string||n/a||URL exclusion filters. This option requires version 3.3 or later of iSiloX and iSiloXC.|
|Include||multi-string||n/a||URL exclusion exception filters. This option requires version 3.3 or later of iSiloX and iSiloXC.|
|ExternalDocuments||container||n/a||Holds the specifications for links to external documents. This option requires version 4.1 or later of iSiloX and iSiloXC.|
If you are creating a document based on a Web site, a maximum
depth value of one is recommended, because each
additional increment in depth beyond one will likely cause an
exponential increase in the size of the document. For example,
if the converted document is one megabyte in size at a link depth
of one, it might be ten megabytes at a link depth of two and
100 megabytes at a link depth of three.
This example specifies a maximum depth value of three, which results in all content up to a link depth of three being included in the converted document.
<MaximumDepth value="3"/>
This example specifies a maximum depth value of zero, which results in no additional content other than the root source file content being included in the converted document.
<MaximumDepth value="0"/>
For the FollowOffsite element, specify the value yes to follow off-site links or the value no to not follow off-site links.
An off-site link is defined as a link to a target in a different domain.
iSiloX and iSiloXC treat all file paths as belonging to the same domain.
For URLs, they treat the protocol (e.g., http://) and the
hostname as comprising the domain. To tell iSiloX or iSiloXC not to
follow links to targets in different domains, set the value attribute
of the FollowOffsite element to no. This is useful
for limiting the amount of irrelevant content brought into the document.
iSiloX and iSiloXC perform the off-site link check anew for each
root source file. This means that you can have root source
files in different domains. For example, you can have two root source
files, one with the URL <http://www.iSilo.com> and another with
the URL <http://www.palm.com>. Assuming that you have set the
value attribute of the FollowOffsite element to no,
then when iSiloX or iSiloXC converts the content at
<http://www.iSilo.com>, only links from there with
target URLs that begin with <http://www.iSilo.com> are followed.
When either converts the content at <http://www.palm.com>,
only links from there with target URLs that begin with
<http://www.palm.com> are followed. If the content
at <http://www.palm.com> has a link to
<http://www.iSilo.com/whatsnew.htm>, that link is not followed.
This example specifies that off-site links should be followed.
<FollowOffsite value="yes"/>
This example specifies that off-site links should not be followed.
<FollowOffsite value="no"/>
The MaximumOffsiteDepth element sets the maximum link depth for following off-site links. If you do not specify the MaximumOffsiteDepth element, it defaults to one. The depth is relative to the source file containing the off-site link, rather than relative to the root source files. If the FollowOffsite element described above is set to no, then the MaximumOffsiteDepth element has no effect.
Note that the value of the MaximumDepth element still limits the total link depth. So if the MaximumDepth element is set to two and the MaximumOffsiteDepth element is set to one, and there is an off-site link from a source file at depth two, that link is not followed, although it is at a depth of one relative to the source file with that off-site link.
The MaximumOffsiteDepth element is useful in the case where
you specify a MaximumDepth value greater than one in order
to include more content from a given site but want to follow
links to off-site articles only to a shallower depth.
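As a sketch of how the two depth limits described above might be combined, the following fragment allows deeper traversal overall while keeping off-site traversal shallow. The specific values are illustrative assumptions, not recommendations from the original text.

```xml
<LinkOptions>
  <!-- Total link depth limit, measured from the root source files (illustrative value) -->
  <MaximumDepth value="3"/>
  <!-- Off-site links must be enabled for MaximumOffsiteDepth to have any effect -->
  <FollowOffsite value="yes"/>
  <!-- Off-site depth limit, measured from the source file containing the off-site link -->
  <MaximumOffsiteDepth value="1"/>
</LinkOptions>
```

With these settings, an off-site link found in a source file at depth three would still not be followed, because its target would exceed the MaximumDepth limit.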
This example specifies a maximum off-site depth value of two, which results in all off-site content up to a link depth of two being included in the converted document.
<MaximumOffsiteDepth value="2"/>
This example specifies a maximum off-site depth value of zero, which results in no off-site content being included in the converted document.
<MaximumOffsiteDepth value="0"/>
For the SubDirOnly element, specify the value yes to limit followed links to those matching the subdirectory of the root source path. Specify the value no to allow all links to be followed.
In many cases, websites are structured hierarchically within
folders and sub-folders. In such cases, the URLs referencing
the pages of the site are probably organized the same way,
with slashes separating the different
levels of folders. For example, the iSiloX.com website has
all support pages within a folder named "support". Within the
support folder, there are sub-folders for different categories
of support, such as a sub-folder named "manual" where the
manuals are located. However, such sub-folder pages may also
have links to pages outside of the folder. If you want
to limit followed links to only sub-folders of the
root source pages, set the value attribute of the
SubDirOnly element to yes.
If you do, then iSiloX only follows links that match up to the
last slash of any of the root source URLs.
As an example, if you wanted to get all the support pages
from the iSiloX.com website, you might specify
http://www.iSiloX.com/support/index.htm as the root source
URL and set the value attribute of the SubDirOnly element
to yes. That page
has a reference to the home page of the site, http://www.iSiloX.com.
However, because you have set the value attribute of the SubDirOnly
element to yes, that link
will not be followed. In contrast, a link such as
http://www.iSiloX.com/support/faq.htm to the frequently asked
questions page will be followed.
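For the support-pages scenario just described, the link options might look like the following sketch. The MaximumDepth value is an illustrative assumption, and the root source URL http://www.iSiloX.com/support/index.htm is specified elsewhere in the file and is not shown here.

```xml
<LinkOptions>
  <!-- Illustrative depth; pick a value that suits the size of the site -->
  <MaximumDepth value="2"/>
  <!-- Follow only links that match the root source URL up to its last slash,
       i.e. http://www.iSiloX.com/support/ in the scenario above -->
  <SubDirOnly value="yes"/>
</LinkOptions>
```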
This example specifies that only links to targets within subdirectories should be followed.
<SubDirOnly value="yes"/>
This example specifies that even links to targets outside of the root subdirectories can be followed.
<SubDirOnly value="no"/>
If you choose to include the unresolved link detail, iSiloX and iSiloXC create a document with an additional page at the end that lists the URLs of all unresolved links. Each unresolved link in the document jumps to its corresponding target URL on this last page. This is useful for later reference and for finding broken hyperlinks.
If you choose not to include the unresolved link detail, the unresolved hyperlinks essentially have no target. When you view the document within a reader and attempt to follow such a hyperlink, the reader tells you that the hyperlink was unresolved but gives no indication of the target URL.
The most common sources of unresolved links are the following:
This example specifies that unresolved link detail should be included.
<UnresolvedDetail value="include"/>
This example specifies that unresolved link detail should not be included.
Between the <Exclude> start tag and the </Exclude> end tag, specify one or more Pattern elements. Each Pattern element is an exclusion filter specified using either a wildcard or regular expression pattern matching string. If the URL of an image matches one of the exclusion patterns, the image is not included in the document. If the target URL of a link matches one of the exclusion patterns, the link is not followed and hence the target content is not included in the document. Exceptions to exclusions can be specified using the Include element.
The format of a pattern string is:
type:options:pattern
In the above, type is either WC for a wildcard pattern or RE for a regular expression pattern, with pattern being the pattern in the format of the specified type to match against. For options, specify C to perform a case-sensitive match. By default, matching is case-insensitive, with the lowercase letters 'a' through 'z' matching the uppercase letters 'A' through 'Z'.
A pattern can be either a wildcard pattern or a regular expression pattern:
<Exclude>
  <Pattern>RE::table[1-9].jpg</Pattern>
  <Pattern>WC::figures*plant?blue</Pattern>
</Exclude>
The first pattern specifies a regular expression pattern with no options, so the match will be case-insensitive. The pattern matches the text "table" followed by any digit character from '1' through '9' and then followed by the text ".jpg". So the pattern will match against any of the following:
Between the <Include> start tag and the </Include> end tag, specify one or more Pattern elements. Each Pattern element is an inclusion filter specified using either a wildcard or regular expression pattern matching string. An inclusion filter serves as an exception to the exclusion filters. If a given URL matches an exclusion filter, the inclusion filters are applied to the URL, and if there is a match against an inclusion filter, the URL is not excluded. For details on how to specify the pattern, see the section on the Exclude element.
<Include>
  <Pattern>RE:C:Table8</Pattern>
</Include>
If this example is taken in conjunction with the example given for the Exclude element, then although the exclusion filters exclude the URL "http://www.acme.org/Table8.jpg", this inclusion filter notes it as an exception and causes it not to be excluded. Note that in this inclusion pattern, the option C has been specified for a case-sensitive match, and so "http://www.acme.org/table8.jpg" would not be noted as an exception.
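The interaction between exclusion and inclusion filters can also be sketched with a second, hypothetical pair of patterns. The "/archive/" URLs below are invented for illustration, and the sketch assumes that a regular expression pattern may match anywhere within a URL, as the "Table8.jpg" example above suggests.

```xml
<LinkOptions>
  <Exclude>
    <!-- Hypothetical: do not follow any URL containing "/archive/" -->
    <Pattern>RE::/archive/</Pattern>
  </Exclude>
  <Include>
    <!-- Hypothetical exception: URLs containing "/archive/index" are still allowed -->
    <Pattern>RE::/archive/index</Pattern>
  </Include>
</LinkOptions>
```

Under those assumptions, a URL containing "/archive/index" matches an exclusion filter first, then matches the inclusion filter and so is not excluded, while other "/archive/" URLs stay excluded.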
|Element||Type||Default||Description|
|Specification||container||n/a||Holds information about an external document specification.|
|Path||string||n/a||Relative path to the external document.|
|Prefix||string||n/a||External link identification prefix.|
|KeepPrefix||value||no||Whether to include the prefix for lookup.|
|MapFile||string||n/a||Path to a map file.|
|Lookup||value||ByName||Method to use for target lookup.|
Note that on Palm OS®, if a file is stored in the internal
storage memory, the document title serves as the file name,
so when converting external documents, it is best to ensure
that the document title and document file name are the same.
Also, on Palm OS®, when a document is stored in the internal
storage memory, any external documents to which it links must also
be stored in the internal memory, and in this case, the reader
application ignores the directory part of external document paths.
Version 4.3 and later of iSilo™ support searching for the first of multiple possibilities. You can specify multiple names to search for by enclosing each name within double-quote characters and separating each double-quote enclosed name from the next with a space. When you do this, iSilo™ opens the first document that it finds in the order listed. This is especially useful on Palm OS®, where a document in the internal database storage memory is identified by its internal database name, since there is no notion of a file name there, whereas a document on a memory card is identified by its file name.
This example specifies that the external document will be a file named "DocumentB" in the same directory.
<Path>DocumentB</Path>
This example specifies that the external document will be a file named "Main Index.pdb" in the directory one level above.
<Path>../Main Index.pdb</Path>
This example specifies that the external document will be a file named "The Art of War" in the directory named "Classics".
<Path>Classics/The Art of War</Path>
This example specifies three different possibilities for the external document:
<Path>"Gulliver's Travels" "Gulliver_s_Travels.pdb" "Gul. Travels"</Path>
<Prefix>DocumentB</Prefix>
The prefix would match against a URL such as "DocumentB/index.htm#TableOfContents" or a URL such as "../DocumentB/How-To/Write A Story.htm".
The <KeepPrefix> tag determines whether the prefix is kept for lookup. Set the value attribute of the tag to no to not include the prefix or to yes to include the prefix.
As an example of a scenario where the prefix should not be included in the lookup, consider two documents, call them document A and document B, that externally link to one another such that each document's content is wholly contained in its own directory. Say that the directory containing document A's content is named DirA and that the directory for document B's content is named DirB. In order for document A to link to document B, for document A, you would specify DirB as the prefix for identifying links as those to document B. For document B, you would specify DirA as the prefix for identifying links as those to document A. The target names within a given document are relative to the first source, which would presumably be some file immediately within the document's directory. Hence, the directory name would not be part of the target name and thus the prefix, which would be the same as the directory name, should not be included for lookup.
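For the DirA and DirB scenario above, the external document specification used when converting document A might look like the following sketch. The Path value "DocumentB" is a hypothetical file name, and the MapFile element is omitted on the assumption that it is not needed when the lookup is by name.

```xml
<LinkOptions>
  <ExternalDocuments>
    <Specification>
      <!-- Hypothetical file name of the converted document B -->
      <Path>DocumentB</Path>
      <!-- Links whose URLs carry this prefix are treated as links to document B -->
      <Prefix>DirB</Prefix>
      <!-- Target names in document B do not contain the directory name, so drop the prefix -->
      <KeepPrefix value="no"/>
      <Lookup value="ByName"/>
    </Specification>
  </ExternalDocuments>
</LinkOptions>
```

The specification used when converting document B would mirror this one, with DirA as the prefix.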
As an example of a scenario where the prefix should be included in the lookup, consider two documents, call them document A and document B, that externally link to one another such that each document's content is spread across two directories. Say that the directories containing document A's content are DirA1 and DirA2 and that the directories containing document B's content are named DirB1 and DirB2. Further, say that the directory containing all four directories is named DirAB. In addition, say that an index file immediately within DirAB links to content in all four subdirectories DirA1, DirA2, DirB1, and DirB2. To create the two documents that link externally to one another, for document A, you would specify two external document specifications, both for externally linking to document B. For the first specification, DirB1 would be the prefix. For the second specification, DirB2 would be the prefix. But since the index file is at the same level as those two directories, you would want to keep the prefix.
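The two specifications that document A would need in this second scenario can be sketched as follows. As before, the Path value "DocumentB" is a hypothetical file name, and the MapFile element is omitted on the assumption that it is not needed when the lookup is by name.

```xml
<LinkOptions>
  <ExternalDocuments>
    <!-- First specification: links into DirB1 go to document B -->
    <Specification>
      <Path>DocumentB</Path>
      <Prefix>DirB1</Prefix>
      <!-- Target names in document B include the directory name, so keep the prefix -->
      <KeepPrefix value="yes"/>
      <Lookup value="ByName"/>
    </Specification>
    <!-- Second specification: links into DirB2 also go to document B -->
    <Specification>
      <Path>DocumentB</Path>
      <Prefix>DirB2</Prefix>
      <KeepPrefix value="yes"/>
      <Lookup value="ByName"/>
    </Specification>
  </ExternalDocuments>
</LinkOptions>
```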
This example specifies that the prefix not be included for lookup.
<KeepPrefix value="no"/>
This example specifies that the prefix should be included for lookup.
<KeepPrefix value="yes"/>
The map file is generated when the target external document
is converted. If two documents link to one another,
it is necessary to perform two conversion passes. The first
pass generates the map files and the second pass uses the map
files for looking up the associated target IDs or offsets.
This example specifies the full path of a map file on a Windows® based computer.
<MapFile>c:\Docs\DocumentB\DocumentB.map</MapFile>
The <Lookup> tag determines the format in which the link information is stored as well as how the lookup is performed in the external document. Set it to one of the following:
ByName: The part of the URL of the link after the prefix is considered the target name and stored as the value to use to identify the target location within the external document when a jump to the target occurs. In order for the links to the target document to work properly, the target document must have been converted with the value attribute of the <Lookup> tag within its <Targets> element set to ByName.
ByID: A numeric value, also known as the target ID, that uniquely identifies the target is stored and used to identify the target location within the external document when a jump to the target occurs. A map file for the external document is needed to look up the target ID values of the external document during conversion.
ByOffset: A numeric value, also known as the target offset, that represents the location of the target in the external document is stored and used when a jump to the target occurs. A map file for the external document is needed to look up the target offset values of the external document during conversion.
The lookup methods each have their own individual advantages and disadvantages.
For the document storage space tradeoffs among the methods, the ByName method requires the largest amount of storage space in the linking document as well as in the targeted external document, unless the target names are few in number and short in length. The ByID and ByOffset methods require approximately the same amount of storage space as each other in the linking document. In the targeted external document, the ByOffset method requires no additional storage space, while the ByID method requires an amount of storage space that is generally less than the ByName method.
In terms of the speed of performing the lookup when a jump occurs to an external document, the difference perceived by the user is probably negligible. The ByName method requires the most processing, the ByID method comes next, and the ByOffset method requires the least processing for lookup.
The other important tradeoff among the methods concerns synchronization between a document and the external documents to which it links. For the purposes of this discussion, let us say that we have a document named DocSource that has links to an external document named DocTarget and that DocTarget is updated independently of DocSource. The content and targets in DocTarget change periodically such that content and targets may be added and removed. Assume though that the targets in DocTarget to which DocSource links are always there, though the specific location of the targets within the content of DocTarget may change.
Given the scenario just described, if the lookup method is ByName, even though DocTarget may undergo many changes and DocSource stays the same, the links from DocSource to DocTarget will always work.
If the lookup method is ByID, this may not be the case. The IDs assigned to each target within DocTarget depend to some extent on all other external targets within DocTarget. If DocTarget gets a new target or one is removed, the target IDs for the other targets may change. As a result, the target IDs stored in DocSource for the targets in DocTarget may become invalid. However, if only the content in DocTarget changes, the target IDs will still be valid.
If the lookup method is ByOffset, then neither the content nor the targets in DocTarget may change if the links from DocSource to DocTarget are to remain valid.
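For the DocSource and DocTarget scenario above, where DocTarget is updated independently, the specification inside DocSource's link options might look like the following sketch. The Path and Prefix values simply reuse the name DocTarget for illustration. The MapFile element is left out on the assumption that it is not needed for ByName lookup, since map files are mentioned above only for ByID and ByOffset.

```xml
<LinkOptions>
  <ExternalDocuments>
    <Specification>
      <!-- Illustrative names taken from the scenario; adjust to the real file and link prefix -->
      <Path>DocTarget</Path>
      <Prefix>DocTarget</Prefix>
      <KeepPrefix value="no"/>
      <!-- Links stay valid as long as the target names still exist in DocTarget -->
      <Lookup value="ByName"/>
    </Specification>
  </ExternalDocuments>
</LinkOptions>
```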
The ByName lookup method, though requiring the most storage space,
is the best method to use for documents that can change independently
of one another. The ByOffset lookup method requires the least
amount of storage space and is a good method to use for documents
that will change together. The ByID lookup method generally requires
only a modest amount of storage space compared to the ByName method
and is a good method to use when only changes to the content,
such as minor corrections, are expected to occur in an external
document.
This example specifies that the target lookup be by name.
<Lookup value="ByName"/>
This example specifies that the target lookup be by ID.
<Lookup value="ByID"/>
This example specifies that the target lookup be by offset.
<Lookup value="ByOffset"/>