Links in EPUB3

Among the features that EPUB3 adds is a new, flexible method of linking inside a document. EPUB3 has two methods of linking: the standard HTML5 method and a new method called CFI (Canonical Fragment Identifier).


HTML5 method of linking

The HTML5 method is the simplest. It can reference either the top of a file or a specific element inside the file, identified by a unique id. For example:

<a href="chapter2.xhtml#id2">link word</a>

This links to the element with the id id2 in chapter2.xhtml. That file should be in the same directory as the current file.


Canonical Fragment Identifier

This is the new method of linking in EPUB3 documents. It may seem complicated at first look. It is complicated, but it offers much more flexibility than the standard HTML5 method of linking. A Canonical Fragment Identifier (CFI) can link not only to a file or an element; it can also link to a specific character in a paragraph, a specific point in an image, or a specific second in a video or audio clip.

Unfortunately, no reading system I know of fully supports it today. I hope that future reading systems will.

In this article I will give an introduction to the Canonical Fragment Identifier (CFI for short).

A link that uses a CFI looks like this:

<a href="book.epub#epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)">link word</a>

The CFI is the part inside the parentheses: /6/4[chap01ref]!/4[body01]/10[para05]/3:10

You should read this just like a path: each number preceded by a slash (/) is a step that selects a child of the current element, starting from the root element of the OPF (package) file.
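To make the path idea concrete, here is a minimal sketch that splits a simple CFI into its steps. The function name `parse_cfi` and the regular expressions are my own illustration, not part of any EPUB library, and the sketch assumes that ids do not contain `]`, `!` or `:` and that the only offset is a character offset.

```python
import re

# One step: a slash, a number, and an optional id in square brackets.
STEP = re.compile(r"/(\d+)(?:\[([^\]]*)\])?")
# A character offset such as ":10" at the very end of the path.
CHAR_OFFSET = re.compile(r":(\d+)$")


def parse_cfi(cfi):
    """Split a simple CFI into per-file step lists and a character offset."""
    inner = cfi.strip()
    # Accept the CFI with or without the epubcfi(...) wrapper.
    if inner.startswith("epubcfi(") and inner.endswith(")"):
        inner = inner[len("epubcfi("):-1]
    match = CHAR_OFFSET.search(inner)
    offset = int(match.group(1)) if match else None
    path = inner[:match.start()] if match else inner
    # '!' marks the jump from one file into the file it references.
    parts = []
    for part in path.split("!"):
        steps = [(int(num), ident or None) for num, ident in STEP.findall(part)]
        parts.append(steps)
    return parts, offset


parts, offset = parse_cfi("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)")
for steps in parts:
    print(steps)
print("offset:", offset)
```

Each step comes back as a (number, id) pair, with the id set to None when the step has no square brackets. The first list holds the steps inside the package file and the second list the steps inside the referenced file.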


Path resolution

CFI works like a path, similar to a path in a Unix file system, but much more versatile and, unfortunately, more complicated.

It will usually start with /6, which means the spine element of the package file.

The package file has three main groups of elements: metadata, manifest and spine.

Elements always get even numbers: the first element is 2 (metadata), the second is 4 (manifest) and the third is 6 (spine).

The next number (4 in my example) means the second itemref in the spine.


But as you can see, the CFI is not only slashes (/) and numbers.

The value in square brackets ([chap01ref]) is the id of the element, in this case the id of the second itemref. It may seem redundant to give both the number and the id.

The reason is that when there is an id, the reading system can correct the link if something has changed within the publication. Specifying the id when possible makes the link more readable and more reliable.

Following the id in square brackets we see an exclamation mark (!). This means that we follow the reference into another file: the file referenced by this itemref.
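The walk through the package file can be sketched in a few lines of Python with the standard `xml.etree.ElementTree` module. The package document below is a made-up sample (its ids and file names are assumptions for illustration), and `child_element` and `resolve_spine_item` are hypothetical helper names. The sketch only handles even (element) steps and ignores the ids in square brackets.

```python
import xml.etree.ElementTree as ET

# A made-up package file, just large enough to follow /6/4 through it.
OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata/>
  <manifest>
    <item id="titlepage" href="title.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter01" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref id="titleref" idref="titlepage"/>
    <itemref id="chap01ref" idref="chapter01"/>
  </spine>
</package>"""


def child_element(parent, step):
    """Return the child element for an even step: 2 is the first, 4 the second."""
    if step % 2 != 0:
        raise ValueError("element steps are even numbers")
    return list(parent)[step // 2 - 1]


def resolve_spine_item(opf_text, steps):
    """Follow the steps in the package file, then the '!' into the manifest."""
    package = ET.fromstring(opf_text)
    node = package
    for step in steps:
        node = child_element(node, step)
    # node is now an itemref; its idref names an item in the manifest (/4).
    manifest = child_element(package, 4)
    for item in manifest:
        if item.get("id") == node.get("idref"):
            return node.get("id"), item.get("href")
    raise LookupError("itemref points to an unknown manifest item")


print(resolve_spine_item(OPF, [6, 4]))
```

With this sample, /6 selects the spine, /4 selects the second itemref (chap01ref), and the exclamation mark leads to chapter1.xhtml.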

Inside that file we go to the second element, the body (the first element in most XHTML files is the head). In this example the body element has an id. This is not required, and in most cases body elements do not have an id.

In this case the id is body01.

We then go to the fifth element (/10) in the body.


This element has the id para05. Inside para05 we go to step 3, which does not have an id. It may look strange that this is not an even number. The reason is that only elements get even numbers; the chunks of text before, between and after the child elements get the odd numbers. So inside the paragraph, 1 is the text before the first child element, 2 is the first child element, and 3 is the text that follows it.

Inside that text we need a character offset of 10 (:10).


The actual paragraph is:


<p id="para05">xxx<em>yyy</em>0123456789</p>


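Here step 1 is the text xxx, step 2 is the em element, and step 3 is the text 0123456789. Offset 10 counts ten characters into that text, which is the position right after the digit 9.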
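The steps inside the chapter file can be followed the same way. The sketch below uses a made-up XHTML file in which the paragraph from the example is the fifth element of the body; the helper name `resolve_text_position` is hypothetical. In `xml.etree.ElementTree` the text before the first child is stored in `.text` and the text after a child in that child's `.tail`, which maps neatly onto the odd step numbers.

```python
import xml.etree.ElementTree as ET

# A made-up chapter file; only the last paragraph comes from the example.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Chapter 1</title></head>
<body id="body01">
<p>one</p>
<p>two</p>
<p>three</p>
<p>four</p>
<p id="para05">xxx<em>yyy</em>0123456789</p>
</body>
</html>"""


def resolve_text_position(root, steps, offset):
    """Follow the steps from root; return the text before and after the position."""
    node = root
    for i, step in enumerate(steps):
        children = list(node)
        if step % 2 == 0:
            # Even step: an element (2 is the first child, 4 the second).
            node = children[step // 2 - 1]
            continue
        if i != len(steps) - 1:
            raise ValueError("a text step must be the last step")
        # Odd step: a chunk of text (1 is before the first child element).
        k = (step - 1) // 2
        text = node.text if k == 0 else children[k - 1].tail
        text = text or ""
        return text[:offset], text[offset:]
    raise ValueError("the path did not end on a text step")


root = ET.fromstring(XHTML)
print(resolve_text_position(root, [4, 10, 3], 10))
print(resolve_text_position(root, [4, 10, 1], 1))
```

For /4/10/3:10 everything in 0123456789 lies before the position and nothing after it, so the position is just past the digit 9. For /4/10/1:1 the position is between the first and second x.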

This may look like a complicated example, and it is, but it is still a simple case. Because the scheme is flexible enough to link into audio, video or part of an image, things get more complicated in those cases.

To point at a time in an audio or video element, the final number is preceded by a tilde (~). For example, ~2.5 means 2.5 seconds into the audio or video element.


This really shows the strength of CFI.

CFI can also reference part of an image.


For example @50:50 would mean the center of an image.

The coordinates are given in percent where 0:0 is always the top left corner and 100:100 is the bottom right corner, regardless of actual image size.

The first number is the horizontal coordinate and the second is the vertical coordinate. You can combine the two for a video, so that ~23@50:50 means the center of the frame 23 seconds from the beginning. Note that an offset with a tilde (~) or an at sign (@) must be the final step.
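As a rough illustration of the time and image offsets, the following sketch parses the three forms shown above and converts the percent coordinates to pixels. The names `parse_media_offset` and `to_pixels` and the 640 by 480 frame size are assumptions made up for this example.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"
# Optional "~seconds" followed by an optional "@x:y" (both in that order).
MEDIA_OFFSET = re.compile(
    rf"^(?:~(?P<t>{NUMBER}))?(?:@(?P<x>{NUMBER}):(?P<y>{NUMBER}))?$"
)


def parse_media_offset(text):
    """Return (seconds, (x_percent, y_percent)); a missing part is None."""
    match = MEDIA_OFFSET.match(text)
    if not text or not match:
        raise ValueError(f"not a time or image offset: {text!r}")
    seconds = float(match.group("t")) if match.group("t") is not None else None
    point = None
    if match.group("x") is not None:
        point = (float(match.group("x")), float(match.group("y")))
    return seconds, point


def to_pixels(point, width, height):
    """Convert percent coordinates (0:0 is top left) to pixel coordinates."""
    x, y = point
    return x / 100 * width, y / 100 * height


print(parse_media_offset("~2.5"))
print(parse_media_offset("@50:50"))
seconds, point = parse_media_offset("~23@50:50")
print(seconds, to_pixels(point, 640, 480))
```

Because the coordinates are percentages, the same offset @50:50 stays at the center whatever the actual size of the image or video frame is.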

After the final numbers you can only put square brackets that contain the text before and after the position. This text is optional, but it is recommended because it makes the link more robust: the reading system uses it to check that it has found the right place, just as it uses ids when possible.


For example, [yyy] means that yyy should come right before the location.

[xx,y] means that xx is before the location and y is after the location.

You can specify text only after the location as [,yyy] where yyy is the text after the location.
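A reading system could check such a text assertion roughly like this. This is only a sketch under simplifying assumptions: the text is treated as one flat string, special characters inside the brackets are not handled, and the names `parse_assertion` and `assertion_holds` are hypothetical.

```python
def parse_assertion(assertion):
    """Split "[before,after]" into its two parts; a missing part is empty."""
    inner = assertion.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("an assertion is written in square brackets")
    before, _, after = inner[1:-1].partition(",")
    return before, after


def assertion_holds(content, offset, assertion):
    """Check that the expected text surrounds the position at offset."""
    before, after = parse_assertion(assertion)
    return content[:offset].endswith(before) and content[offset:].startswith(after)


# The paragraph from the example, flattened to plain text.
text = "xxxyyy0123456789"
print(assertion_holds(text, 6, "[yyy]"))    # yyy is right before offset 6
print(assertion_holds(text, 2, "[xx,x]"))   # xx before, x after
print(assertion_holds(text, 6, "[,0123]"))  # only text after the location
print(assertion_holds(text, 6, "[abc]"))    # wrong text, so the check fails
```

When the check fails, a reading system can search nearby for the expected text instead of trusting the numbers, which is what makes the link more robust.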

The complete specification for CFI is at:


