HyperCard file format
Caveat Emptor!
Although Bill Atkinson originally intended to publish it, the HyperCard file format has never been officially published. The instructions in this file are simply what was deduced by looking at various existing stacks and their differences.
Warning: The information in this document is not complete enough to allow the creation of new HyperCard stacks, but maybe it can be helpful in reading existing stacks and extracting precious data to keep it from being lost.
Prerequisites
The format dates from Classic Mac OS (HyperCard shipped from 1987 through 2004), so many of its data types are from that era. All text is encoded in the MacRoman text encoding, and many flags and data types come from the QuickDraw headers or are based on them. All data is stored in big-endian format (as used by the Motorola 68000).
The block file layout
A HyperCard stack is a stream of blocks, each with a four-character type code and a 4-byte signed ID number. The stream is terminated by a 'TAIL' block. Each block has the following basic layout:
4 bytes | Block size including size, type and ID. |
4 bytes | Block type |
4 bytes | Block ID |
4 bytes | Filler 0 |
n bytes | Data |
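The layout above can be walked with a short loop. The following is a minimal Python sketch, not a definitive reader: the function name `iter_blocks` is hypothetical, and it assumes the block size counts the whole block, including the 16-byte header and the filler.

```python
import struct

def iter_blocks(data):
    """Yield (type, id, payload) for each block in a stack's data fork."""
    offset = 0
    while offset + 16 <= len(data):
        # Big-endian: size, four-character type, signed ID, filler.
        size, kind, block_id, _filler = struct.unpack_from(">I4sii", data, offset)
        if size < 16:
            raise ValueError(f"bad block size {size} at offset {offset}")
        # Assumption: size covers the header, so the payload starts 16 bytes in.
        yield kind.decode("mac_roman"), block_id, data[offset + 16:offset + size]
        if kind == b"TAIL":
            break
        offset += size
```

Calling `list(iter_blocks(open(path, "rb").read()))` on a stack's data fork would give the sequence of blocks in file order, which is not necessarily the card order.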
The Stack
This block is always the first in a HyperCard file. It is present once:
4 bytes | Block size including size, type and ID. | ||||
4 bytes | Block type 'STAK' | ||||
4 bytes | Block ID, always -1 | ||||
4 bytes | Filler 0 | ||||
4 bytes | Format, 0: not HyperCard stack, 1-7: pre-release HyperCard 1.x, 8: HyperCard 1.x, 9: pre-release HyperCard 2.x, 10: HyperCard 2.x | ||||
4 bytes | Total size of the data fork. | ||||
4 bytes | Size of the STAK block. | ||||
4 bytes | Unknown. Small value if the stack is large. Could be a hint of the buffer size to allocate in order to read the stack. | ||||
4 bytes | Maximum ever of previous value | ||||
4 bytes | Number of backgrounds in this stack. | ||||
4 bytes | ID of the first background. | ||||
4 bytes | Number of cards in this stack. | ||||
4 bytes | ID of the first card. | ||||
4 bytes | ID of the 'LIST' block in the stack. | ||||
4 bytes | Number of FREE blocks. | ||||
4 bytes | Total size of all FREE blocks (=the free size of this stack). | ||||
4 bytes | ID of the 'PRNT' block in the stack. | ||||
4 bytes | Hash of the password (not resolved, not the same as ask password hash). | ||||
2 bytes | User Level (1 ... 5) for this stack. | ||||
2 bytes | Alignment short | ||||
2 bytes | Protection flags, Bit 10: cantPeek, Bit 11: cantAbort, Bit 13: privateAccess, Bit 14: cantDelete, Bit 15: cantModify. | ||||
2 bytes | Alignment short | ||||
16 bytes | Skip to offset 0x60 | ||||
4 bytes | HyperCard Version at creation. Version Format: xx yy zz rr, xx: major, yy: minor, zz: state (80 final, 60 beta, 40 alpha, 20 development), rr: non-release. For example, 02206044 is version 2.2 beta release 44, and 02418000 is version 2.4.1 final | ||||
4 bytes | HyperCard Version at last compacting. See format above | ||||
4 bytes | HyperCard Version at last modification since last compacting. See format above | ||||
4 bytes | HyperCard Version at last modification. See format above | ||||
4 bytes | Checksum. To check it: read the STAK block from byte 0 to 0x600 (up to the start of the script) as an array of 32-bit integers. The sum of the integers must be zero. | ||||
4 bytes | Number of marked cards. | ||||
2 bytes | Top of the card window. | ||||
2 bytes | Left of the card window. | ||||
2 bytes | Bottom of the card window. | ||||
2 bytes | Right of the card window. | ||||
2 bytes | Top of the screen. | ||||
2 bytes | Left of the screen. | ||||
2 bytes | Bottom of the screen. | ||||
2 bytes | Right of the screen. | ||||
2 bytes | X coordinate of the scroll. | ||||
2 bytes | Y coordinate of the scroll. | ||||
2 bytes | Unknown (normally zero). | ||||
2 bytes | Unknown (normally zero). | ||||
288 bytes | Skip to offset 0x1B0 | ||||
4 bytes | ID of the FTBL (font table) block. | ||||
4 bytes | ID of the STBL (style table) block. | ||||
2 bytes | Height in pixels of cards in this stack (default if zero: 342). | ||||
2 bytes | Width in pixels of cards in this stack (default if zero: 512). | ||||
2 bytes | Unknown (normally zero). | ||||
2 bytes | Unknown (normally zero). | ||||
256 bytes | Skip to offset 0x2C0 | ||||
Table of patterns (320 bytes). For each of the 40 patterns:
| |||||
Table of FREE blocks (offset 0x400, variable size). For each FREE block:
| |||||
Variable | Skip to offset 0x600 | ||||
n bytes | Stack script as a C string, terminated by a NULL byte. If this starts with a null, the stack script is a compiled script for an OSA scripting component. |
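Two of the STAK fields above can be illustrated in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the version bytes are read as binary-coded decimal (which matches the two examples given, 02206044 and 02418000), and the checksum sum is taken modulo 2^32.

```python
import struct

STATES = {0x80: "final", 0x60: "beta", 0x40: "alpha", 0x20: "development"}

def bcd(byte):
    """Read one byte as two decimal digits (assumption: BCD encoding)."""
    return (byte >> 4) * 10 + (byte & 0x0F)

def decode_version(value):
    """Split a 32-bit version into (major, minor, bugfix, state, release)."""
    major = bcd((value >> 24) & 0xFF)
    minor_byte = (value >> 16) & 0xFF
    state = STATES.get((value >> 8) & 0xFF, "unknown")
    release = bcd(value & 0xFF)
    return major, minor_byte >> 4, minor_byte & 0x0F, state, release

def stak_checksum_ok(stak_block):
    """Sum the first 0x600 bytes as big-endian 32-bit integers; expect zero."""
    words = struct.unpack(">384I", stak_block[:0x600])  # 0x600 / 4 = 384
    return (sum(words) & 0xFFFFFFFF) == 0
```

With these definitions, `decode_version(0x02206044)` gives `(2, 2, 0, "beta", 44)` and `decode_version(0x02418000)` gives `(2, 4, 1, "final", 0)`.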
The Master
This block is an index of all the blocks present in the HyperCard file (excluding STAK, MAST, FREE, and TAIL blocks). It is always the second block in the file, present once just after the STAK block:
4 bytes | Block size including size, type and ID. | ||
4 bytes | Block type 'MAST' | ||
4 bytes | Block ID, always -1 | ||
4 bytes | Filler 0 | ||
16 bytes | Skip to offset 0x20 | ||
In all the remaining bytes (ignore the null entries but loop till the end of the block):
|
The List
This block contains the list of the cards. It appears once in the file but has no fixed position. It must be read because the card blocks are not stored in the file in the right order. To speed up insertions and deletions, the list is segmented into sections called pages.
4 bytes | Block size including size, type and ID. | ||||
4 bytes | Block type 'LIST' | ||||
4 bytes | Block ID | ||||
4 bytes | Filler =0 | ||||
4 bytes | Number of pages | ||||
4 bytes | Size of a page (normally 0x800) | ||||
4 bytes | Total number of card entries in all the pages, should be equal to the number of cards | ||||
2 bytes | Size of a card entry in the pages | ||||
2 bytes | Unknown (normally 2) | ||||
2 bytes | Number of hash integers in a card entry in the pages, equal to (entry size - 4)/4 | ||||
2 bytes | Search hash value count (see List Page). Gives the number of values used in search hashes. | ||||
4 bytes | Checksum. To compute it: start with Checksum: UInt32 = 0, then for every page: Checksum = RotateRight3Bits(Checksum + Identifier) + CardCount | ||||
4 bytes | Total number of entries in all the pages, should be equal to the number of cards | ||||
4 bytes | Skip to offset 0x30 | ||||
For each page:
|
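The LIST checksum rule given in the table can be sketched in Python as follows. The names `rotate_right_3` and `list_checksum` are hypothetical, and the sketch assumes 32-bit unsigned wraparound on every addition, as the UInt32 type suggests.

```python
MASK32 = 0xFFFFFFFF

def rotate_right_3(value):
    """Rotate a 32-bit value right by three bits."""
    value &= MASK32
    return ((value >> 3) | (value << 29)) & MASK32

def list_checksum(pages):
    """pages: iterable of (page_id, card_count) pairs, in list order."""
    checksum = 0
    for page_id, card_count in pages:
        # Assumption: both additions wrap around at 32 bits.
        checksum = (rotate_right_3(checksum + page_id) + card_count) & MASK32
    return checksum
```

The result would be compared against the checksum field stored in the LIST block.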
List Page
A page block contains a section of the card list:
4 bytes | Block size including size, type and ID. | ||||||
4 bytes | Block type 'PAGE' | ||||||
4 bytes | Block ID | ||||||
4 bytes | Filler 0 | ||||||
4 bytes | ID of the list | ||||||
4 bytes | Checksum. To compute it: start with Checksum: UInt32 = 0, then for every card in the page: Checksum = RotateRight3Bits(Checksum + Identifier) | ||||||
For each card (in the list, look for the number of card entries in that page and the size of every card entry):
|
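The PAGE checksum differs from the LIST one only in that nothing is added after the rotation. A minimal Python sketch, with the hypothetical name `page_checksum` and assuming 32-bit unsigned wraparound:

```python
def page_checksum(card_ids):
    """card_ids: the card identifiers of one page, in page order."""
    checksum = 0
    for card_id in card_ids:
        total = (checksum + card_id) & 0xFFFFFFFF
        # Rotate the 32-bit total right by three bits.
        checksum = (total >> 3) | ((total << 29) & 0xFFFFFFFF)
    return checksum
```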
Each card entry contains a hash intended to speed up word searches. It is equal to the hashes of every word in the card contents, bitwise OR'ed together. To check whether a word is present, compute its hash, bitwise AND it with the card search hash, and check that the result still equals the word hash. The hashed words are case insensitive and at least 3 characters long.
Within the data devoted to the search hash, the hash starts one bit after the beginning and its length is the greatest prime number that can fit. For example, if card entries have a size of 16 bytes, the hash data is 16-4-1=11 bytes long (don't count the card ID and the flag byte), which makes 11*8=88 bits, minus the first one: 87 bits. 87 is not prime; the greatest prime under 87 is 83, so the last 4 bits are never used.
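The length calculation in this paragraph can be written out as a small Python sketch. The function names are hypothetical; the arithmetic follows the worked example (4 bytes of card ID, 1 flag byte, and one skipped leading bit).

```python
def is_prime(n):
    """Trial division; adequate for the small sizes involved here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def search_hash_bits(entry_size):
    """Greatest prime number of bits that fits in a card entry's hash data."""
    available = (entry_size - 4 - 1) * 8 - 1  # drop card ID, flag byte, first bit
    for bits in range(available, 1, -1):
        if is_prime(bits):
            return bits
    return 0
```

For the 16-byte entry of the example, `search_hash_bits(16)` returns 83.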
To compute the hash of a word, the characters are first converted to a case-insensitive code, reduced to 32 values on 5 bits. In that code, the letters a-z have the values 1-26. Then the digits 0-4 have the values 27-31. As there is no room left, the digits 5-9 share the codes of the least used letters "j", "q", "v", "x" and "z". Letters with accents are indexed as letters without accents.
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 10 | 17 | 22 | 24 | 26 |
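The table above translates directly into a lookup function. This Python sketch uses the hypothetical name `char_code`; stripping accents through Unicode NFD decomposition is an assumption about how "letters without accents" could be obtained after decoding MacRoman text, not something the format prescribes.

```python
import unicodedata

# Digits 5-9 reuse the codes of j, q, v, x and z.
HIGH_DIGITS = {"5": 10, "6": 17, "7": 22, "8": 24, "9": 26}

def char_code(ch):
    """Return the 5-bit code of one character, or None if it has no code."""
    # Assumption: NFD decomposition puts the unaccented base letter first.
    base = unicodedata.normalize("NFD", ch)[0].lower()
    if "a" <= base <= "z":
        return ord(base) - ord("a") + 1
    if "0" <= base <= "4":
        return 27 + int(base)
    return HIGH_DIGITS.get(base)
```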
Then a given set of values must be computed. After each value is computed, it must be represented in the hash by activating the bit at the index given by the value, modulo the size of the hash. The bits are counted from left to right. In that way the word hash is progressively formed, starting with only clear bits. Here are the values:
charCodes[0]*4 + charCodes[1]*8192 - (charCodes[1]/8)*65535 + charCodes[2]*256
charCodes[i] + (charCodes[i-1]<<5) + (charCodes[i-2]<<10) + (charCodes[i-3]<<15)
Note: the word "the" is ignored, and the numbers often are, even with trailing units like in "26k".
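Once the values for a word are known, testing a card entry comes down to checking bits. The following Python sketch uses hypothetical names and takes the word's values as input rather than computing them. It assumes that "left to right" means starting from the most significant bit of the first byte, and that `hash_data` is the part of the card entry after the card ID and flag byte.

```python
def hash_bit_is_set(hash_data, index):
    """Test bit `index` of the search hash inside a card entry's hash data."""
    position = index + 1  # the hash starts one bit after the beginning
    # Assumption: bit 0 of each byte, counted from the left, is the 0x80 bit.
    return bool(hash_data[position // 8] & (0x80 >> (position % 8)))

def card_may_contain(hash_data, word_values, hash_bits):
    """True if every bit the word would set is set in the card's hash."""
    return all(hash_bit_is_set(hash_data, value % hash_bits)
               for value in word_values)
```

Because the card hash is the OR of many word hashes, a positive answer only means the word may be present; the card text still has to be searched to confirm it.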
Card
A card block contains the properties of the card, followed by a list of the parts (buttons and fields mixed), followed by a list of the text contents of the parts (including buttons), followed by the card name, followed by the card script.
4 bytes | Block size including size, type and ID. | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | Block type 'CARD' | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | Block ID | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | Filler =0 | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | ID of bitmap block storing card picture (0 means transparent) | ||||||||||||||||||||||||||||||||||||||||||||
2 bytes | Flags, Bit 14: cantDelete, Bit 13: not showPict, Bit 11: dontSearch | ||||||||||||||||||||||||||||||||||||||||||||
2 bytes | Alignment short | ||||||||||||||||||||||||||||||||||||||||||||
8 bytes | Skip to offset 0x20 | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | ID of the page containing this card | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | ID of background | ||||||||||||||||||||||||||||||||||||||||||||
2 bytes | Number of parts | ||||||||||||||||||||||||||||||||||||||||||||
2 bytes | ID to give to a new part | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | Total size of the part list | ||||||||||||||||||||||||||||||||||||||||||||
2 bytes | Number of part contents | ||||||||||||||||||||||||||||||||||||||||||||
4 bytes | Total size of the part content list | ||||||||||||||||||||||||||||||||||||||||||||
For each part:
| |||||||||||||||||||||||||||||||||||||||||||||
For each part content entry:
| |||||||||||||||||||||||||||||||||||||||||||||
n bytes | Name of the card as a zero-terminated C string. | ||||||||||||||||||||||||||||||||||||||||||||
n bytes | Script of the card as a zero-terminated C string. If this starts with something other than a null, the card script is a HyperTalk script. If this starts with a null, the card script is a compiled script for an OSA scripting component. | ||||||||||||||||||||||||||||||||||||||||||||
Optional OSA script data:
|
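The fixed part of the CARD block, up to the start of the part list, can be unpacked in one step. This is a Python sketch with hypothetical names; it assumes flag bits are numbered from the least significant bit (so bit 14 is 0x4000) and treats the counts as unsigned.

```python
import struct

# Block header, then picture/flags/skip, then page, background and list sizes.
CARD_HEADER = struct.Struct(">I4sii" "iH2x8x" "iiHHiHi")

def parse_card_header(block):
    """Decode the fixed 0x36-byte header at the start of a CARD block."""
    (size, kind, card_id, _filler, bitmap_id, flags, page_id, background_id,
     part_count, next_part_id, part_list_size,
     content_count, content_list_size) = CARD_HEADER.unpack_from(block)
    if kind != b"CARD":
        raise ValueError("not a CARD block")
    return {
        "size": size, "id": card_id, "bitmap_id": bitmap_id,
        # Assumption: bit 0 is the least significant bit of the flags word.
        "cant_delete": bool(flags & (1 << 14)),
        "show_pict": not flags & (1 << 13),
        "dont_search": bool(flags & (1 << 11)),
        "page_id": page_id, "background_id": background_id,
        "part_count": part_count, "next_part_id": next_part_id,
        "part_list_size": part_list_size,
        "content_count": content_count,
        "content_list_size": content_list_size,
    }
```

The part list would then start at offset `CARD_HEADER.size` (0x36) within the block.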
Background
A background block is similar to a card block except that the header is smaller, with different properties.
4 bytes | Block size including size, type and ID. |
4 bytes | Block type 'BKGD' |
4 bytes | Block ID number |
4 bytes | Filler =0 |
4 bytes | ID of bitmap block storing background picture (0 means transparent) |
2 bytes | Flags, Bit 14: cantDelete, Bit 13: not showPict, Bit 11: dontSearch |
2 bytes | Alignment short |
4 bytes | Number of cards in this background |
4 bytes | Next background ID |
4 bytes | Previous background ID |
2 bytes | Number of parts |
...same as CARD block afterwards |
BitMap
A bitmap stores the picture of a card or a background. It has two layers with one bit per pixel: an image, to tell where the black pixels are, and a mask, to tell where the white pixels are. This is not the classical notion of mask because a pixel activated in the image and not in the mask is black, not transparent. Pixels activated in neither the image nor the mask are transparent.
The mask and the image both have rectangles where they are enclosed, relative to the card coordinates. Outside the rectangles, the pixels are transparent. The mask and image rectangles are not necessarily in the same place.
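The way the two layers combine can be summed up in a tiny Python function. The name `pixel_color` is hypothetical; the bits are assumed to have already been looked up in their respective rectangles, with 0 for any pixel outside a rectangle.

```python
def pixel_color(image_bit, mask_bit):
    """Combine one image bit and one mask bit into a pixel colour."""
    if image_bit:
        return "black"        # the image layer marks black pixels
    if mask_bit:
        return "white"        # mask without image means opaque white
    return "transparent"      # neither layer is set
```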
4 bytes | Block size including size, type and ID. |
4 bytes | Block type 'BMAP' |
4 bytes | Block ID |
4 bytes | Filler, =0 |
2 bytes | Unknown (normally 0) |
2 bytes | Unknown (normally 0) |
2 bytes | Unknown (normally 1) |
2 bytes | Unknown (normally 0) |
2 bytes | Top of the card rectangle |
2 bytes | Left of the card rectangle |
2 bytes | Bottom of the card rectangle |
2 bytes | Right of the card rectangle |
2 bytes | Top of the mask rectangle |
2 bytes | Left of the mask rectangle |
2 bytes | Bottom of the mask rectangle |
2 bytes | Right of the mask rectangle |
2 bytes | Top of the image rectangle |
2 bytes | Left of the image rectangle |
2 bytes | Bottom of the image rectangle |
2 bytes | Right of the image rectangle |
4 bytes | Unknown (normally 0) |
4 bytes | Unknown (normally 0) |
4 bytes | Size of the mask data |
4 bytes | Size of the image data |
n bytes | Mask data. See format below. |
n bytes | Image data. See format below. |
The mask and image data is stored in a compressed image format that Rebecca Bettencourt christened Wrath of Bill Atkinson, or WOBA, for its tortuous complexity. The left side of the bounding rectangle must be rounded down and the right side rounded up to the nearest multiple of 32 before decompressing the data. The compressed data is a series of instructions of various lengths. The first byte of an instruction indicates how long the instruction is and what it does. The remaining bytes, if any, give data needed for that instruction. Rows can be encoded with a single instruction (with opcodes in the range 0x80-0x87) or a sequence of multiple instructions (with opcodes in the ranges of 0x00-0x7F and 0xC0-0xFF). Some operations change the manner in which rows are decoded (opcodes in the range 0x88-0xBF). The end of a row comes either when the row has been filled (the number of bytes in a row, determined by the mask or image's bounding rect, has been reached) or when an instruction in the range 0x80-0xBF is encountered.
Mask and image rectangle can be equal to zero if they are absent. If the mask data is provided, the decompressed mask data is used for the mask. If the mask data is not provided but the mask rectangle is provided, the mask rectangle is used for the mask, as if all the pixels in the rectangle were equal to 1. If neither the mask data nor the mask bounding rectangle are provided, there is no mask (which does not mean that the image is transparent).
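The rounding of the bounding rectangle before decompression, and the resulting row length, can be sketched as follows. The function name `aligned_row` is hypothetical.

```python
def aligned_row(left, right):
    """Round left down and right up to multiples of 32; return the row size."""
    aligned_left = left - left % 32
    aligned_right = right + (-right) % 32
    row_bytes = (aligned_right - aligned_left) // 8  # one bit per pixel
    return aligned_left, aligned_right, row_bytes
```

For example, a rectangle spanning columns 10 to 500 is decoded as if it spanned 0 to 512, which gives 64 bytes per row.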
Instructions are listed below:
Opcode | Operation | Description |
0x00-0x7F | dz xx xx xx ... | z zero bytes followed by d data bytes |
0x80 | 80 xx xx xx ... | one row of uncompressed data |
0x81 | 81 | one white row |
0x82 | 82 | one black row |
0x83 | 83 xx | one row of a repeated byte of data |
0x84 | 84 | one row of a repeated byte of data previously used. Keep an array of eight bytes. Initialize it with 0xAA55AA55AA55AA55. When a 0x83 instruction is encountered, take the row y-coordinate modulo 8, and put the byte into that element of the array. When a 0x84 instruction is encountered, take the row y-coordinate modulo 8, and use that element of the array to fill the row. |
0x85 | 85 | copy the previous row |
0x86 | 86 | copy the row before the previous row |
0x87 | 87 | not used |
0x88 | 88 | dh = 16, dv = 0. Initially, dh = 0 and dv = 0. These instructions change dh and dv for every row after the instruction. Every time a row is completed, the following operations are performed: (1) Make a copy of the row. (2) If dh != 0, repeat until the end of the row: shift the copied row to the right dh bits, then XOR the copied row with the original row. (3) If dv != 0, XOR the copied row with the row dv rows back. (4) Copy the row back to the original. Rows compressed with an 0x80-0x87 opcode are not affected. |
0x89 | 89 | dh = 0, dv = 0. See above. |
0x8A | 8A | dh = 0, dv = 1. See above. |
0x8B | 8B | dh = 0, dv = 2. See above. |
0x8C | 8C | dh = 1, dv = 0. See above. |
0x8D | 8D | dh = 1, dv = 1. See above. |
0x8E | 8E | dh = 2, dv = 2. See above. |
0x8F | 8F | dh = 8, dv = 0. See above. |
0x90-0x9F | not used | |
0xA0-0xBF | 101nnnnn | repeat the next instruction n times |
0xC0-0xDF | 110ddddd xx ... | d*8 bytes of data |
0xE0-0xFF | 111zzzzz | z*16 bytes of zero |
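As a reading aid for the table above, the first byte of each instruction can be classified as in the Python sketch below. It only restates the table: the name `describe_opcode` and the returned labels are hypothetical, and it does not decode any row data.

```python
DH_DV = {0x88: (16, 0), 0x89: (0, 0), 0x8A: (0, 1), 0x8B: (0, 2),
         0x8C: (1, 0), 0x8D: (1, 1), 0x8E: (2, 2), 0x8F: (8, 0)}
ROW_OPS = {0x80: "raw row", 0x81: "white row", 0x82: "black row",
           0x83: "repeat byte", 0x84: "repeat stored byte",
           0x85: "copy previous row", 0x86: "copy row before previous"}

def describe_opcode(op):
    """Return (label, parameters) for one instruction's first byte."""
    if op <= 0x7F:
        # 'dz': high nibble counts data bytes, low nibble counts zero bytes.
        return "zeros then data", {"zeros": op & 0x0F, "data": op >> 4}
    if op in ROW_OPS:
        return ROW_OPS[op], {}
    if op in DH_DV:
        dh, dv = DH_DV[op]
        return "set dh/dv", {"dh": dh, "dv": dv}
    if 0xA0 <= op <= 0xBF:
        return "repeat next", {"count": op & 0x1F}
    if 0xC0 <= op <= 0xDF:
        return "data", {"bytes": (op & 0x1F) * 8}
    if op >= 0xE0:
        return "zeros", {"bytes": (op & 0x1F) * 16}
    return "unused", {}  # 0x87 and 0x90-0x9F
```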
The Style Table
The Style Table stores all the formatting used in the multi-styled text fields of the stack. It appears only once in the file. The styles have the format defined by the old TextEdit API.
4 bytes | Block size including size, type and ID. | ||||||||||||||||||||||
4 bytes | Block type 'STBL' | ||||||||||||||||||||||
4 bytes | Block ID | ||||||||||||||||||||||
4 bytes | Filler, =0 | ||||||||||||||||||||||
4 bytes | Number of styles | ||||||||||||||||||||||
4 bytes | Style ID to use for next style | ||||||||||||||||||||||
For each style:
|
The Font Table
Since font IDs were not consistent across the installations, HyperCard stores a table of the names of the fonts used in the stack. This block appears only once in a file.
4 bytes | Block size including size, type and ID. | ||||||
4 bytes | Block type 'FTBL' | ||||||
4 bytes | Block ID | ||||||
4 bytes | Filler, =0 | ||||||
4 bytes | Number of fonts | ||||||
4 bytes | Unknown (normally 0) | ||||||
For each font:
|
The Print Setting
This block contains the HyperCard print settings and template indexes. It only appears once in the file.
4 bytes | Block size including size, type and ID. | ||||||||
4 bytes | Block type 'PRNT' | ||||||||
4 bytes | Block ID | ||||||||
4 bytes | Filler, =0 | ||||||||
32 bytes | Skip to offset 0x30 (unknown format in-between) | ||||||||
2 bytes | Page Set-Up (PRST block) ID | ||||||||
258 bytes | Skip to offset 0x134 (unknown format in-between) | ||||||||
2 bytes | Number of report templates | ||||||||
For each template:
|
The Page Set-Up
This block is the Mac OS print setting. It is the same structure as is documented in Inside Macintosh: Imaging with QuickDraw: Printing Manager, except with the HyperCard block header attached to it.
4 bytes | Block size including size, type and ID. |
4 bytes | Block type 'PRST' |
4 bytes | Block ID |
4 bytes | Filler, =0 |
2 bytes | Printing Manager version that initialized this record |
2 bytes | Used internally by Printing Manager |
2 bytes | Vertical resolution (dots per inch) |
2 bytes | Horizontal resolution (dots per inch) |
2 bytes | Printable page top, always 0. The rectangle is counted from the top-left of the printable page. |
2 bytes | Printable page left, always 0. The rectangle is counted from the top-left of the printable page. |
2 bytes | Printable page bottom. The rectangle is counted from the top-left of the printable page. |
2 bytes | Printable page right. The rectangle is counted from the top-left of the printable page. |
2 bytes | Paper top. The rectangle is counted from the top-left of the printable page. |
2 bytes | Paper left. The rectangle is counted from the top-left of the printable page. |
2 bytes | Paper bottom. The rectangle is counted from the top-left of the printable page. |
2 bytes | Paper right. The rectangle is counted from the top-left of the printable page. |
2 bytes | Printer device number |
2 bytes | PageV |
2 bytes | PageH |
1 byte | Port |
1 byte | Feed type: continuous or separate sheets |
2 bytes | Used internally by Printing Manager |
2 bytes | Vertical Resolution |
2 bytes | Horizontal Resolution |
2 bytes | Page top |
2 bytes | Page left |
2 bytes | Page bottom |
2 bytes | Page right |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
2 bytes | First page |
2 bytes | Last page |
2 bytes | Number of copies |
1 byte | Printing method, 0: draft, 1: deferred |
1 byte | Used internally by Printing Manager |
4 bytes | Background procedure pointer, should be 0 |
4 bytes | Spool file name |
2 bytes | Spool file volume |
1 byte | Spool file version |
1 byte | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
4 bytes | Reserved |
Report template
4 bytes | Block size including size, type and ID. | ||||||||||||||||||||||||||||
4 bytes | Block type 'PRFT' | ||||||||||||||||||||||||||||
4 bytes | Block ID | ||||||||||||||||||||||||||||
4 bytes | Filler, =0 | ||||||||||||||||||||||||||||
1 byte | Unit, 0: centimeters, 1: millimeters, 2: inches, 3: points/pixels | ||||||||||||||||||||||||||||
1 byte | Alignment to 16 bits | ||||||||||||||||||||||||||||
2 bytes | Margin top | ||||||||||||||||||||||||||||
2 bytes | Margin left | ||||||||||||||||||||||||||||
2 bytes | Margin bottom | ||||||||||||||||||||||||||||
2 bytes | Margin right | ||||||||||||||||||||||||||||
2 bytes | Spacing height | ||||||||||||||||||||||||||||
2 bytes | Spacing width | ||||||||||||||||||||||||||||
2 bytes | Cell height | ||||||||||||||||||||||||||||
2 bytes | Cell width | ||||||||||||||||||||||||||||
2 bytes | Flags, Bit 8: left to right (as opposed to top to bottom), Bit 0: dynamic height | ||||||||||||||||||||||||||||
1 byte | Header length | ||||||||||||||||||||||||||||
n bytes | Header. The following control characters can be embedded in the header string: 0x01 (control-A): date, 0x02 (control-B): time, 0x03 (control-C): stack name, 0x04 (control-D): page number | ||||||||||||||||||||||||||||
2 bytes | Number of report items | ||||||||||||||||||||||||||||
For each report item:
|
The Tail
This block is always at the end of a HyperCard file.
4 bytes | Block size including size, type and ID. |
4 bytes | Block type 'TAIL' |
4 bytes | Block ID, always -1 |
4 bytes | Filler, =0 |
1 byte | Length of the tail string |
n bytes | Tail string (without ending null). For HyperCard 1.x: "That's all folks..." For HyperCard 2.x: "Nu är det slut…" |
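The tail string is a length-prefixed string in MacRoman, so reading it takes two steps. A minimal Python sketch, where `read_tail_string` is a hypothetical name and `payload` is assumed to be the TAIL block's data after the 16-byte block header:

```python
def read_tail_string(payload):
    """Decode the length-prefixed MacRoman string of a TAIL block."""
    length = payload[0]  # one length byte, no terminating null
    return payload[1:1 + length].decode("mac_roman")
```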
If you have found out more about this file format, feel free to amend this document and let us know.
Written up by Mister Z and Pierre Lorenzi.