HyperCard file format

Caveat Emptor!

Although Bill Atkinson originally intended to publish it, the HyperCard file format has never been officially documented. The information in this file is simply what was deduced by looking at various existing stacks and their differences.

Warning: The information in this document is not complete enough to allow the creation of new HyperCard stacks, but maybe it can be helpful in reading existing stacks and extracting precious data to keep it from being lost.

Prerequisites

Because this file format comes from Classic Mac OS (HyperCard shipped from 1987 through 2004), many of its data types date from that era. All text is encoded in MacRoman, and many flags and data types come from the QuickDraw headers or are based on them. All data is stored in big-endian format (as on the Motorola 68000).

The block file layout

A HyperCard stack is a stream of blocks, each with a four-character type code and a 4-byte signed ID number. The stream is terminated by a 'TAIL' block. Each block has the following basic layout:

4 bytes     Block size, including size, type and ID.
4 bytes     Block type
4 bytes     Block ID
4 bytes     Filler, 0
n bytes     Data
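
The layout above is enough to walk through a stack file block by block. The following is a minimal sketch in Python, not a reference implementation; the name iter_blocks and the error handling are assumptions made for illustration.

```python
import struct
from typing import Iterator, Tuple

# Block header: size, four-character type, signed ID, filler (16 bytes, big-endian).
HEADER = struct.Struct(">I4siI")

def iter_blocks(data: bytes) -> Iterator[Tuple[int, str, int, int]]:
    """Yield (file offset, type, ID, size) for each block, stopping after 'TAIL'."""
    offset = 0
    while offset + HEADER.size <= len(data):
        size, raw_type, block_id, _filler = HEADER.unpack_from(data, offset)
        if size < HEADER.size:
            # A block can never be smaller than its own header.
            raise ValueError(f"implausible block size {size} at offset {offset}")
        block_type = raw_type.decode("mac_roman")
        yield offset, block_type, block_id, size
        if block_type == "TAIL":
            break
        offset += size
```

The block size includes the header, so adding it to the current offset gives the start of the next block.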

The Stack

This block is always the first in a HyperCard file. It is present once:

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'STAK'
4 bytes     Block ID, always -1
4 bytes     Filler, 0
4 bytes     Format. 0: not a HyperCard stack, 1-7: pre-release HyperCard 1.x, 8: HyperCard 1.x, 9: pre-release HyperCard 2.x, 10: HyperCard 2.x
4 bytes     Total size of the data fork.
4 bytes     Size of the STAK block.
4 bytes     Unknown. Small value if the stack is large. Could be a hint of the buffer size to allocate in order to read the stack.
4 bytes     Maximum value the previous field has ever had.
4 bytes     Number of backgrounds in this stack.
4 bytes     ID of the first background.
4 bytes     Number of cards in this stack.
4 bytes     ID of the first card.
4 bytes     ID of the 'LIST' block in the stack.
4 bytes     Number of FREE blocks.
4 bytes     Total size of all FREE blocks (= the free size of this stack).
4 bytes     ID of the 'PRNT' block in the stack.
4 bytes     Hash of the password (algorithm not worked out; not the same as the "ask password" hash).
2 bytes     User level (1 ... 5) for this stack.
2 bytes     Alignment short
2 bytes     Protection flags. Bit 10: cantPeek, Bit 11: cantAbort, Bit 13: privateAccess, Bit 14: cantDelete, Bit 15: cantModify.
2 bytes     Alignment short
16 bytes    Skip to offset 0x60
4 bytes     HyperCard version at creation.
            Version format: xx yy zz rr. xx: major; yy: minor, with the bug-fix revision in its low nibble (41 is 4.1); zz: state (80 final, 60 beta, 40 alpha, 20 development); rr: non-release.
            For example, 02206044 is version 2.2 beta release 44, and 02418000 is version 2.4.1 final.
4 bytes     HyperCard version at last compacting. See format above.
4 bytes     HyperCard version at last modification since last compacting. See format above.
4 bytes     HyperCard version at last modification. See format above.
4 bytes     Checksum. To check it, read the STAK block from byte 0 to 0x600 (the start of the script) as an array of 32-bit integers. The sum of the integers must be zero.
4 bytes     Number of marked cards.
2 bytes     Top of the card window.
2 bytes     Left of the card window.
2 bytes     Bottom of the card window.
2 bytes     Right of the card window.
2 bytes     Top of the screen.
2 bytes     Left of the screen.
2 bytes     Bottom of the screen.
2 bytes     Right of the screen.
2 bytes     X coordinate of the scroll.
2 bytes     Y coordinate of the scroll.
2 bytes     Unknown (normally zero).
2 bytes     Unknown (normally zero).
288 bytes   Skip to offset 0x1B0
4 bytes     ID of the FTBL (font table) block.
4 bytes     ID of the STBL (style table) block.
2 bytes     Height in pixels of cards in this stack (default if zero: 342).
2 bytes     Width in pixels of cards in this stack (default if zero: 512).
2 bytes     Unknown (normally zero).
2 bytes     Unknown (normally zero).
256 bytes   Skip to offset 0x2C0
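
The version fields can be decoded with a few shifts. This is only a sketch based on the two examples given in the table: it assumes the digits are stored as hexadecimal digits that read as decimal, and that the second byte holds the minor number and the bug-fix number in its two nibbles. The name decode_version and the output format are made up for illustration.

```python
STATES = {0x80: "final", 0x60: "beta", 0x40: "alpha", 0x20: "development"}

def decode_version(raw: int) -> str:
    """Turn a 32-bit version field into text such as '2.4.1 final'."""
    major = (raw >> 24) & 0xFF
    minor = (raw >> 20) & 0x0F      # high nibble of the second byte (assumption)
    bugfix = (raw >> 16) & 0x0F     # low nibble of the second byte (assumption)
    state = STATES.get((raw >> 8) & 0xFF, "unknown state")
    release = raw & 0xFF
    # Digits are printed in hex so that 0x44 reads as "44".
    text = f"{major:x}.{minor:x}"
    if bugfix:
        text += f".{bugfix:x}"
    text += f" {state}"
    if release:
        text += f" release {release:x}"
    return text
```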

Table of patterns (offset 0x2C0, 320 bytes). For each of the 40 patterns:

8 bytes     Raw data for an 8x8 bitmap, with one byte representing one row.
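
To inspect a pattern, each of its 8 bytes can be expanded into a row of 8 pixels. The sketch below assumes the usual QuickDraw convention that the most significant bit is the leftmost pixel and that a set bit is black; the function name pattern_rows is hypothetical.

```python
from typing import List

def pattern_rows(raw: bytes) -> List[str]:
    """Render one 8-byte pattern as 8 text rows ('#' = set bit, '.' = clear bit)."""
    if len(raw) != 8:
        raise ValueError("a pattern is exactly 8 bytes long")
    rows = []
    for row_byte in raw:
        # Bit 7 is taken as the leftmost pixel (assumption, QuickDraw convention).
        rows.append("".join("#" if (row_byte >> (7 - x)) & 1 else "." for x in range(8)))
    return rows
```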

Table of FREE blocks (offset 0x400, variable size). For each FREE block:

4 bytes     Offset of the FREE block in the file.
4 bytes     Size of the FREE block.

Variable    Skip to offset 0x600
n bytes     Stack script as a C string, terminated by a null byte.
            If this starts with a null, the stack script is a compiled script for an OSA scripting component.
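
The checksum rule for the STAK block can be tested as follows. This is a sketch under two assumptions: the integers are 32-bit big-endian words, and the sum wraps around modulo 2^32. The function name stak_checksum_ok is hypothetical.

```python
import struct

def stak_checksum_ok(stak_block: bytes) -> bool:
    """Return True if the first 0x600 bytes of the STAK block sum to zero."""
    if len(stak_block) < 0x600:
        raise ValueError("STAK block is shorter than 0x600 bytes")
    # 0x600 bytes are 384 big-endian 32-bit words.
    words = struct.unpack(">384I", stak_block[:0x600])
    # The sum is assumed to wrap around at 32 bits.
    return (sum(words) & 0xFFFFFFFF) == 0
```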

The Master

This block is an index of all the blocks present in the HyperCard file (excluding STAK, MAST, FREE, and TAIL blocks). It is always the second block in the file, present once just after the STAK block:

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'MAST'
4 bytes     Block ID, always -1
4 bytes     Filler, 0
16 bytes    Skip to offset 0x20

In all the remaining bytes (ignore the null entries but loop until the end of the block):

4 bytes     The most significant 24 bits of the integer give the offset of the block from the start of the stack file, in multiples of 32 bytes. The least significant 8 bits give the least significant 8 bits of the block's ID number.
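
Reading the index entries amounts to splitting each 32-bit integer in two. The sketch below assumes that the 24 offset bits are the upper bits of the integer; parse_mast_entries is a hypothetical helper name.

```python
import struct
from typing import List, Tuple

def parse_mast_entries(mast_block: bytes) -> List[Tuple[int, int]]:
    """Return (file offset, low byte of block ID) for every non-null MAST entry."""
    entries = []
    # Entries start at offset 0x20 and run to the end of the block.
    for pos in range(0x20, len(mast_block) - 3, 4):
        (value,) = struct.unpack_from(">I", mast_block, pos)
        if value == 0:
            continue  # null entries are ignored
        file_offset = (value >> 8) * 32   # upper 24 bits, in units of 32 bytes
        id_low_byte = value & 0xFF        # only the low 8 bits of the ID are stored
        entries.append((file_offset, id_low_byte))
    return entries
```

Since only 8 bits of the ID are stored, a reader still has to check the full ID in the block header found at the given offset.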

The List

This block contains the list of the cards. It appears once in the file but has no fixed position. It must be read because the card blocks are not stored in card order. To speed up insertions and deletions, the list is divided into sections called pages.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'LIST'
4 bytes     Block ID
4 bytes     Filler, =0
4 bytes     Number of pages
4 bytes     Size of a page (normally 0x800)
4 bytes     Total number of card entries in all the pages, should be equal to the number of cards
2 bytes     Size of a card entry in the pages
2 bytes     Unknown (normally 2)
2 bytes     Number of hash integers in a card entry in the pages, equal to (entry size - 4)/4
2 bytes     Search hash value count. Cf. List Page. Gives the number of values used in search hashes.
4 bytes     Checksum. To compute it:
            Checksum: UInt32 = 0
            For every page: Checksum = RotateRight3Bits(Checksum + Identifier) + CardCount
4 bytes     Total number of entries in all the pages, should be equal to the number of cards
4 bytes     Skip to offset 0x30

For each page:

4 bytes     ID of 'PAGE' block
2 bytes     The number of cards in the page

List Page

A page block contains a section of the card list:

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'PAGE'
4 bytes     Block ID
4 bytes     Filler, 0
4 bytes     ID of the list
4 bytes     Checksum. To compute it:
            Checksum: UInt32 = 0
            For every card in the page: Checksum = RotateRight3Bits(Checksum + Identifier)
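
Both the LIST and the PAGE checksums rely on a 3-bit right rotation of a 32-bit value. The sketch below spells this out, assuming that "Identifier" means the page ID in the LIST block and the card ID in the PAGE block, and that additions wrap around at 32 bits. All function names are made up for illustration.

```python
from typing import Iterable, Tuple

MASK32 = 0xFFFFFFFF

def rotate_right_3(value: int) -> int:
    """Rotate a 32-bit value right by 3 bits."""
    value &= MASK32
    return ((value >> 3) | (value << 29)) & MASK32

def list_checksum(pages: Iterable[Tuple[int, int]]) -> int:
    """Checksum of a LIST block from its (page ID, card count) entries."""
    checksum = 0
    for page_id, card_count in pages:
        checksum = (rotate_right_3(checksum + page_id) + card_count) & MASK32
    return checksum

def page_checksum(card_ids: Iterable[int]) -> int:
    """Checksum of a PAGE block from the card IDs it lists."""
    checksum = 0
    for card_id in card_ids:
        checksum = rotate_right_3(checksum + card_id)
    return checksum
```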

For each card (the number of card entries in this page and the size of each card entry are given in the LIST block):

4 bytes     Card ID
1 byte      Flags. Bit 4: marked card, Bit 5: has text content, Bit 6: is the start of a background, Bit 7: has a name
Variable    Text search hash. See format below.

Each card entry contains a hash intended to speed up word searches. It is the bitwise OR of the hashes of every word in the card contents. To check whether a word may be present, compute its hash, AND it with the card search hash, and check that the result still equals the word's hash. The hashed words are case-insensitive and at least 3 characters long.

Within the data devoted to the search hash, the hash starts one bit after the beginning and its length in bits is the greatest prime number that fits. For example, if card entries are 16 bytes long, the hash data is 16-4-1 = 11 bytes long (not counting the card ID and the flag byte), which makes 11*8 = 88 bits, or 87 bits without the first one. 87 is not prime; the greatest prime below 87 is 83, so the last 4 bits are never used.
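
The length computation in the example above can be written out as follows. This is a small sketch; the helper names are hypothetical and the entry size is the value stored in the LIST block.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for these small sizes."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

def search_hash_bit_count(entry_size: int) -> int:
    """Number of hash bits in a card entry of the given size in bytes."""
    # Remove the 4-byte card ID and the flag byte, then the first bit.
    available = (entry_size - 4 - 1) * 8 - 1
    for candidate in range(available, 1, -1):
        if is_prime(candidate):
            return candidate
    raise ValueError("card entry too small to hold a search hash")
```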

To compute the hash of a word, the characters are first converted to a case-insensitive code with 32 values, stored on 5 bits. In that code, the letters a-z have the values 1-26 and the digits 0-4 have the values 27-31. As there is no room left, the digits 5-9 share the codes of the least used letters "j", "q", "v", "x" and "z". Letters with accents are indexed as letters without accents.

Char  A  B  C  D  E  F  G  H  I  J  K  L  M
Code  1  2  3  4  5  6  7  8  9 10 11 12 13

Char  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Code 14 15 16 17 18 19 20 21 22 23 24 25 26

Char  0  1  2  3  4  5  6  7  8  9
Code 27 28 29 30 31 10 17 22 24 26
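
The character table can be expressed as a small function. This is a sketch that works on Python strings rather than raw MacRoman bytes; accents are stripped with Unicode decomposition, which is an assumption about how accented letters map to plain ones. Characters outside letters and digits are rejected because the text does not say how they are handled. The name char_code is hypothetical.

```python
import unicodedata

# Digits 5-9 share the codes of the letters j, q, v, x and z.
HIGH_DIGITS = {"5": 10, "6": 17, "7": 22, "8": 24, "9": 26}

def char_code(ch: str) -> int:
    """Map one character to its 5-bit search code (1-31)."""
    # Decompose accented letters and keep the base letter only.
    base = unicodedata.normalize("NFD", ch)[0].lower()
    if "a" <= base <= "z":
        return ord(base) - ord("a") + 1
    if base in HIGH_DIGITS:
        return HIGH_DIGITS[base]
    if "0" <= base <= "4":
        return 27 + int(base)
    raise ValueError(f"no search code known for {ch!r}")
```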

Then a fixed set of values is computed. Each value is recorded in the hash by setting the bit whose index is the value modulo the size of the hash. The bits are counted from left to right. The word hash is built up in this way, starting with all bits clear. Here are the values:

  • charCodes[0]*4096 - (charCodes[0]/16)*65535 + charCodes[1]*128 + charCodes[2]*4
  • charCodes[0]*16384 - (charCodes[0]/4)*65535 + charCodes[1]*512 + charCodes[2]*16
  • charCodes[0] + charCodes[1]*2048 + charCodes[2]*64
  • Only if Search Hash Value Count (cf. List) is equal to 4:
    charCodes[0]*4 + charCodes[1]*8192 - (charCodes[1]/8)*65535 + charCodes[2]*256
  • For each character code at position i greater than or equal to 3:
    charCodes[i] + (charCodes[i-1]<<5) + (charCodes[i-2]<<10) + (charCodes[i-3]<<15)

Note: the word "the" is ignored, and numbers often are too, even with trailing units as in "26k".
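
The list of values above can be sketched in Python as follows. The divisions are assumed to be integer divisions (the codes are at most 31, so each quotient is a small whole number), and the input is a list of character codes that has already been computed. The names hash_values and word_hash_bits are hypothetical.

```python
from typing import List, Set

def hash_values(codes: List[int], value_count: int = 3) -> List[int]:
    """Compute the raw values for a word given as a list of 5-bit character codes."""
    if len(codes) < 3:
        raise ValueError("hashed words are at least 3 characters long")
    c0, c1, c2 = codes[0], codes[1], codes[2]
    values = [
        c0 * 4096 - (c0 // 16) * 65535 + c1 * 128 + c2 * 4,
        c0 * 16384 - (c0 // 4) * 65535 + c1 * 512 + c2 * 16,
        c0 + c1 * 2048 + c2 * 64,
    ]
    if value_count == 4:
        # Extra value, used only when the LIST block says 4 values per hash.
        values.append(c0 * 4 + c1 * 8192 - (c1 // 8) * 65535 + c2 * 256)
    for i in range(3, len(codes)):
        values.append(codes[i] + (codes[i - 1] << 5)
                      + (codes[i - 2] << 10) + (codes[i - 3] << 15))
    return values

def word_hash_bits(codes: List[int], hash_bit_count: int, value_count: int = 3) -> Set[int]:
    """Indices (counted from the left) of the bits set in the word hash."""
    return {value % hash_bit_count for value in hash_values(codes, value_count)}
```

A word can only be on a card if every index returned by word_hash_bits is also set in that card's search hash, keeping in mind that the hash starts one bit after the beginning of the hash data.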

Card

A card block contains the properties of the card, followed by a list of the parts (buttons and fields mixed), followed by a list of the text contents of the parts (including buttons), followed by the card name, followed by the card script.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'CARD'
4 bytes     Block ID
4 bytes     Filler, =0
4 bytes     ID of bitmap block storing card picture (0 means transparent)
2 bytes     Flags. Bit 14: cantDelete, Bit 13: not showPict, Bit 11: dontSearch
2 bytes     Alignment short
8 bytes     Skip to offset 0x20
4 bytes     ID of the page containing this card
4 bytes     ID of background
2 bytes     Number of parts
2 bytes     ID to give to a new part
4 bytes     Total size of the part list
2 bytes     Number of part contents
4 bytes     Total size of the part content list

For each part:

2 bytes     Size of this part entry
2 bytes     Part ID
1 byte      Type. 1: button, 2: field
1 byte      Flags. Bit 7: not visible, Bit 5: dontWrap, Bit 4: dontSearch, Bit 3: sharedText, Bit 2: not fixedLineHeight, Bit 1: autoTab, Bit 0: (not enabled)/lockText
2 bytes     Top of part rectangle.
2 bytes     Left of part rectangle.
2 bytes     Bottom of part rectangle.
2 bytes     Right of part rectangle.
1 byte      Flags. Bit 7: showName/autoSelect, Bit 6: highlight/showLines, Bit 5: autoHighlight/wideMargins, Bit 4: (not sharedHighlight)/multipleLines, Bits 3-0: family
1 byte      Style. 0: transparent, 1: opaque, 2: rectangle, 3: roundRect, 4: shadow, 5: checkBox, 6: radio, 7: scrolling, 8: standard, 9: default, 10: oval, 11: popup.
2 bytes     titleWidth/lastSelectedLine
2 bytes     icon ID/(first)SelectedLine
2 bytes     textAlignment. 0: left (or default?), 1: center, -1: right, (-2: force left align?)
2 bytes     textFont ID (cf. Font Table)
2 bytes     Text size
1 byte      Text style flags. Bit 7: group, Bit 6: extend, Bit 5: condense, Bit 4: shadow, Bit 3: outline, Bit 2: underline, Bit 1: italic, Bit 0: bold
1 byte      Filler, =0
2 bytes     Line height
n bytes     Name, zero-terminated C string.
1 byte      Script indicator, =0
n bytes     Script, zero-terminated C string.
0-1 bytes   Alignment byte, if needed.

For each part content entry:

2 bytes     Part ID. If this is < 0, this is an entry for a card part with ID -partID; otherwise it is for a background part.
2 bytes     Length of the content entry, not counting the part ID and length fields

Either:

1 byte      Plain text marker, =0

or:

2 bytes     Length of styles. The highest bit is always set and must be ignored.

Style data, containing (length of styles / 4) entries like:

2 bytes     Text position
2 bytes     Style ID (cf. Style Table)

In both cases, followed by:

n bytes     Text. Not null-terminated. For a card button or a background button with sharedHilite = true, this is the button contents. For a background button with sharedHilite = false, this is "1" when hilite = true and empty otherwise.

After the part contents:

n bytes     Name of the card as a zero-terminated C string.
n bytes     Script of the card as a zero-terminated C string.
            If this starts with something other than a null, the card script is a HyperTalk script. If this starts with a null, the card script is a compiled script for an OSA scripting component.

Optional OSA script data:

2 bytes     Offset from end of this field to OSA script (jump across header)
2 bytes     Length of OSA script
n bytes     Remainder of header
n bytes     OSA script

Background

A background block is similar to a card block except that the header is smaller, with different properties.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'BKGD'
4 bytes     Block ID
4 bytes     Filler, =0
4 bytes     ID of bitmap block storing background picture (0 means transparent)
2 bytes     Flags. Bit 14: cantDelete, Bit 13: not showPict, Bit 11: dontSearch
2 bytes     Alignment short
4 bytes     Number of cards in this background
4 bytes     Next background ID
4 bytes     Previous background ID
2 bytes     Number of parts
...same as CARD block afterwards

BitMap

A bitmap stores the picture of a card or a background. It has two layers with one bit per pixel: an image, which tells where the black pixels are, and a mask, which tells where the white pixels are. This is not the classical notion of a mask, because a pixel activated in the image but not in the mask is black, not transparent. Pixels activated in neither the image nor the mask are transparent.

The mask and the image each have an enclosing rectangle, given in card coordinates. Outside these rectangles, the pixels are transparent. The mask and image rectangles are not necessarily in the same place.
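
The way the two layers combine can be sketched for a single decompressed row. The sketch assumes that both rows cover the same horizontal span, that the most significant bit of each byte is the leftmost pixel, and that a pixel set in both layers is black. The function name compose_row is hypothetical.

```python
def compose_row(image_row: bytes, mask_row: bytes) -> str:
    """Combine one image row and one mask row: '#' black, '.' white, ' ' transparent."""
    if len(image_row) != len(mask_row):
        raise ValueError("rows must cover the same span")
    pixels = []
    for image_byte, mask_byte in zip(image_row, mask_row):
        for bit in range(7, -1, -1):          # leftmost pixel first (assumption)
            if (image_byte >> bit) & 1:
                pixels.append("#")            # image bit set: black
            elif (mask_byte >> bit) & 1:
                pixels.append(".")            # only mask bit set: white
            else:
                pixels.append(" ")            # neither set: transparent
    return "".join(pixels)
```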

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'BMAP'
4 bytes     Block ID
4 bytes     Filler, =0
2 bytes     Unknown (normally 0)
2 bytes     Unknown (normally 0)
2 bytes     Unknown (normally 1)
2 bytes     Unknown (normally 0)
2 bytes     Top of the card rectangle
2 bytes     Left of the card rectangle
2 bytes     Bottom of the card rectangle
2 bytes     Right of the card rectangle
2 bytes     Top of the mask rectangle
2 bytes     Left of the mask rectangle
2 bytes     Bottom of the mask rectangle
2 bytes     Right of the mask rectangle
2 bytes     Top of the image rectangle
2 bytes     Left of the image rectangle
2 bytes     Bottom of the image rectangle
2 bytes     Right of the image rectangle
4 bytes     Unknown (normally 0)
4 bytes     Unknown (normally 0)
4 bytes     Size of the mask data
4 bytes     Size of the image data
n bytes     Mask data. See format below.
n bytes     Image data. See format below.

The mask and image data are stored in a compressed image format that Rebecca Bettencourt christened Wrath of Bill Atkinson, or WOBA, for its tortuous complexity. Before decompressing the data, the left side of the bounding rectangle must be rounded down and the right side rounded up to the nearest multiple of 32. The compressed data is a series of instructions of various lengths. The first byte of an instruction indicates how long the instruction is and what it does. The remaining bytes, if any, give the data needed for that instruction. A row can be encoded with a single instruction (opcodes in the range 0x80-0x87) or with a sequence of instructions (opcodes in the ranges 0x00-0x7F and 0xC0-0xFF). Some operations change the manner in which rows are decoded (opcodes in the range 0x88-0xBF). A row ends either when it has been filled (the number of bytes in a row, determined by the bounding rectangle of the mask or image, has been reached) or when an instruction in the range 0x80-0xBF is encountered.

The mask and image rectangles can be all zero if the corresponding layer is absent. If mask data is provided, the decompressed mask data is used for the mask. If no mask data is provided but the mask rectangle is, the mask rectangle is used for the mask, as if all the pixels in the rectangle were equal to 1. If neither the mask data nor the mask rectangle is provided, there is no mask (which does not mean that the image is transparent).
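
The rounding of the bounding rectangle described above determines how many bytes each decompressed row holds. A minimal sketch, with a hypothetical function name:

```python
from typing import Tuple

def row_geometry(left: int, right: int) -> Tuple[int, int, int]:
    """Return (aligned left, aligned right, bytes per row) for a bounding rectangle."""
    aligned_left = left & ~31            # round down to a multiple of 32
    aligned_right = (right + 31) & ~31   # round up to a multiple of 32
    row_bytes = (aligned_right - aligned_left) // 8
    return aligned_left, aligned_right, row_bytes
```

For a full-width image on a 512-pixel card, this gives 64 bytes per row.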

Instructions are listed below:

Opcode      Operation           Description
0x00-0x7F   dz xx xx xx ...     z zero bytes followed by d data bytes
0x80        80 xx xx xx ...     one row of uncompressed data
0x81        81                  one white row
0x82        82                  one black row
0x83        83 xx               one row of a repeated byte of data
0x84        84                  one row of a repeated byte of data previously used
0x85        85                  copy the previous row
0x86        86                  copy the row before the previous row
0x87        87                  not used
0x88        88                  dh = 16, dv = 0
0x89        89                  dh = 0, dv = 0
0x8A        8A                  dh = 0, dv = 1
0x8B        8B                  dh = 0, dv = 2
0x8C        8C                  dh = 1, dv = 0
0x8D        8D                  dh = 1, dv = 1
0x8E        8E                  dh = 2, dv = 2
0x8F        8F                  dh = 8, dv = 0
0x90-0x9F                       not used
0xA0-0xBF   101nnnnn            repeat the next instruction n times
0xC0-0xDF   110ddddd xx ...     d*8 bytes of data
0xE0-0xFF   111zzzzz            z*16 bytes of zero

Note on 0x83 and 0x84: Keep an array of eight bytes. Initialize it with 0xAA55AA55AA55AA55. When a 0x83 instruction is encountered, take the row y-coordinate modulo 8, and put the byte into that element of the array. When a 0x84 instruction is encountered, take the row y-coordinate modulo 8, and use that element of the array to fill the row.

Note on 0x88-0x8F: Initially, dh = 0 and dv = 0. These instructions change dh and dv for every row after the instruction. Every time a row is completed, the following operations are performed: Make a copy of the row. If dh != 0, repeat until the end of the row: Shift the copied row to the right dh bits. XOR the copied row with the original row. If dv != 0, XOR the copied row with the row dv rows back. Copy the row back to the original. Rows compressed with an 0x80-0x87 opcode are not affected.
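
The first byte of each instruction can be classified with a few range checks. The sketch below only decodes the opcode byte and does not decompress anything. It assumes that in the 0x00-0x7F range the high nibble is d and the low nibble is z, as the "dz" notation suggests. The tuple labels and the name describe_opcode are made up for illustration.

```python
from typing import Tuple

# dh and dv settings selected by opcodes 0x88-0x8F.
DH_DV = {0x88: (16, 0), 0x89: (0, 0), 0x8A: (0, 1), 0x8B: (0, 2),
         0x8C: (1, 0), 0x8D: (1, 1), 0x8E: (2, 2), 0x8F: (8, 0)}

def describe_opcode(opcode: int) -> Tuple:
    """Classify one WOBA opcode byte and extract the counts packed into it."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must be a single byte")
    if opcode < 0x80:
        # (label, zero byte count z, data byte count d)
        return ("zeros_then_data", opcode & 0x0F, opcode >> 4)
    if opcode <= 0x87:
        return ("whole_row", opcode)
    if opcode <= 0x8F:
        dh, dv = DH_DV[opcode]
        return ("set_dh_dv", dh, dv)
    if opcode <= 0x9F:
        return ("unused",)
    if opcode <= 0xBF:
        return ("repeat", opcode & 0x1F)
    if opcode <= 0xDF:
        return ("data", (opcode & 0x1F) * 8)
    return ("zeros", (opcode & 0x1F) * 16)
```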

The Style Table

The Style Table stores all the formatting used in the multi-styled text fields of the stack. It appears only once in the file. The styles have the format defined by the old TextEdit API.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'STBL'
4 bytes     Block ID
4 bytes     Filler, =0
4 bytes     Number of styles
4 bytes     Style ID to use for next style

For each style:

4 bytes     Style ID
2 bytes     Unknown (normally 0)
2 bytes     Unknown (normally 1)
2 bytes     Not used (line height)
2 bytes     Not used (font ascent)
2 bytes     Font ID. -1 if same as containing field.
2 bytes     Style flags. Bit 15: group, Bit 14: extend, Bit 13: condense, Bit 12: shadow, Bit 11: outline, Bit 10: underline, Bit 9: italic, Bit 8: bold. -1 if same as containing field.
2 bytes     Font size. -1 if same as containing field.
2 bytes     Not used (red)
2 bytes     Not used (green)
2 bytes     Not used (blue)

The Font Table

Since font IDs were not consistent across installations, HyperCard stores a table of the names of the fonts used in the stack. This block appears only once in a file.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'FTBL'
4 bytes     Block ID
4 bytes     Filler, =0
4 bytes     Number of fonts
4 bytes     Unknown (normally 0)

For each font:

2 bytes     Font ID
n bytes     Name of the font, null-terminated
0-1 bytes   Alignment byte

The Print Setting

This block contains the HyperCard print settings and template indexes. It appears only once in the file.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'PRNT'
4 bytes     Block ID
4 bytes     Filler, =0
32 bytes    Skip to offset 0x30 (unknown format in-between)
2 bytes     Page Set-Up (PRST block) ID
258 bytes   Skip to offset 0x134 (unknown format in-between)
2 bytes     Number of report templates

For each template:

4 bytes     Template ID
1 byte      Template name length
n bytes     Template name
(36 - name length - 5) bytes  Alignment bytes, every template entry is 36 bytes long
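
Because every template entry is padded to 36 bytes, the entries can be read at fixed strides. The sketch below takes the bytes that follow the template count and assumes the template ID is a signed 32-bit integer like other block IDs; parse_template_entries is a hypothetical name.

```python
import struct
from typing import List, Tuple

ENTRY_SIZE = 36  # every template entry is 36 bytes long

def parse_template_entries(data: bytes, count: int) -> List[Tuple[int, str]]:
    """Return (template ID, template name) for each of the count entries in data."""
    entries = []
    for index in range(count):
        base = index * ENTRY_SIZE
        (template_id,) = struct.unpack_from(">i", data, base)
        name_length = data[base + 4]
        # The name is a length-prefixed MacRoman string; the rest is padding.
        name = data[base + 5 : base + 5 + name_length].decode("mac_roman")
        entries.append((template_id, name))
    return entries
```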

The Page Set-Up

This block is the Mac OS print setting. It has the same structure as documented in Inside Macintosh: Imaging with QuickDraw: Printing Manager, except that the HyperCard block header is attached to it.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'PRST'
4 bytes     Block ID
4 bytes     Filler, =0
2 bytes     Printing Manager version that initialized this record
2 bytes     Used internally by Printing Manager
2 bytes     Vertical resolution (dots per inch)
2 bytes     Horizontal resolution (dots per inch)
2 bytes     Printable page top, always 0
2 bytes     Printable page left, always 0
2 bytes     Printable page bottom
2 bytes     Printable page right
2 bytes     Paper top
2 bytes     Paper left
2 bytes     Paper bottom
2 bytes     Paper right
            Both rectangles above (printable page and paper) are counted from the top-left of the printable page.
2 bytes     Printer device number
2 bytes     PageV
2 bytes     PageH
1 byte      Port
1 byte      Feed type: continuous or separate sheets
2 bytes     Used internally by Printing Manager
2 bytes     Vertical resolution
2 bytes     Horizontal resolution
2 bytes     Page top
2 bytes     Page left
2 bytes     Page bottom
2 bytes     Page right
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
2 bytes     First page
2 bytes     Last page
2 bytes     Number of copies
1 byte      Printing method. 0: draft, 1: deferred
1 byte      Used internally by Printing Manager
4 bytes     Background procedure pointer, should be 0
4 bytes     Spool file name
2 bytes     Spool file volume
1 byte      Spool file version
1 byte      Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved
4 bytes     Reserved

Report template

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'PRFT'
4 bytes     Block ID
4 bytes     Filler, =0
1 byte      Unit. 0: centimeters, 1: millimeters, 2: inches, 3: points/pixels
1 byte      Alignment to 16 bits
2 bytes     Margin top
2 bytes     Margin left
2 bytes     Margin bottom
2 bytes     Margin right
2 bytes     Spacing height
2 bytes     Spacing width
2 bytes     Cell height
2 bytes     Cell width
2 bytes     Flags. Bit 8: left to right (as opposed to top to bottom), Bit 0: dynamic height
1 byte      Header length
n bytes     Header. The following control characters can be embedded in the header string: 0x01 (control-A): date, 0x02 (control-B): time, 0x03 (control-C): stack name, 0x04 (control-D): page number
2 bytes     Number of report items

For each report item:

2 bytes     Size of item, including this field
2 bytes     Top
2 bytes     Left
2 bytes     Bottom
2 bytes     Right
2 bytes     Column count
2 bytes     Flags. Bit 13: change height, Bit 12: change style, Bit 11: change size, Bit 10: change font, Bit 4: invert, Bit 3: right frame, Bit 2: bottom frame, Bit 1: left frame, Bit 0: top frame
2 bytes     Text size
2 bytes     Text height
2 bytes     Text style. Bit 15: group, Bit 14: extend, Bit 13: condense, Bit 12: shadow, Bit 11: outline, Bit 10: underline, Bit 9: italic, Bit 8: bold
2 bytes     Text align. 0: left, 1: center, -1: right
n bytes     Contents, null-terminated
n bytes     Text font, null-terminated
0-1 bytes   Alignment byte

The Tail

This block is always at the end of a HyperCard file.

4 bytes     Block size, including size, type and ID.
4 bytes     Block type 'TAIL'
4 bytes     Block ID, always -1
4 bytes     Filler, =0
1 byte      Length of the tail string
n bytes     Tail string (without ending null)
            For HyperCard 1.x: "That's all folks..."
            For HyperCard 2.x: "Nu är det slut…"

If you have found out more about this file format, feel free to amend this document and let us know.

Written up by Mister Z and Pierre Lorenzi.