Rust AST

Examples of TokenStream and DeriveInput objects in Rust, using the syn crate.

jroddev

06 Sep 2022 — 4 min read

Rust Compilation Process

The Rust compilation process looks roughly like this: a lexer tokenizes the source, then rustc lowers the code through several intermediate representations, where type checking and optimizations take place. Finally, the IR is passed on to LLVM to generate the output binaries.

More rustc details here

Macros

When you start working with Rust macros, you will likely need to interact with AST (abstract syntax tree) objects. You can use the syn crate to parse an incoming TokenStream into a DeriveInput, and then the quote crate to turn it back into a TokenStream. Both crates work with the TokenStream type from the proc_macro2 crate, a wrapper around the compiler's proc_macro types that can also be used outside a procedural macro, for example in unit tests.

In this post I show two types (a struct and an enum) along with their TokenStream and DeriveInput representations. You can parse a TokenStream into a DeriveInput like this:

let ast: syn::DeriveInput = syn::parse(input).unwrap();

Struct

struct Vector3 {
    x: f32,
    y: f32,
    z: f32
}

The Vector3 struct produces the TokenStream below, which syn parses into the DeriveInput that follows it. The Ident members give us access to the names of things: struct, Vector3, x, y, z.

It's also worth noting that a struct body is wrapped in braces, { and }, which shows up as a Group with delimiter: Brace. Tuples are similar but use parentheses, ( and ), which show up as delimiter: Parenthesis.

TokenStream [
    Ident { 
        ident: "struct", 
        span: #0 bytes(143..149)
    }, 
    Ident { 
        ident: "Vector3", 
        span: #0 bytes(150..157)
    }, 
    Group { 
        delimiter: Brace, 
        stream: TokenStream [
            Ident { 
                ident: "x", 
                span: #0 bytes(164..165)
            }, 
            Punct { 
                ch: ':', 
                spacing: Alone, 
                span: #0 bytes(165..166)
            }, 
            Ident { 
                ident: "f32", 
                span: #0 bytes(167..170)
            }, 
            Punct { 
                ch: ',', 
                spacing: Alone, 
                span: #0 bytes(170..171)
            }, 
            Ident { 
                ident: "y", 
                span: #0 bytes(176..177)
            }, 
            Punct { 
                ch: ':', 
                spacing: Alone, 
                span: #0 bytes(177..178)
            }, 
            Ident { 
                ident: "f32", 
                span: #0 bytes(179..182)
            }, 
            Punct { 
                ch: ',', 
                spacing: Alone, 
                span: #0 bytes(182..183)
            }, 
            Ident { 
                ident: "z", 
                span: #0 bytes(188..189)
            }, 
            Punct { 
                ch: ':', 
                spacing: Alone, 
                span: #0 bytes(189..190)
            }, 
            Ident { 
                ident: "f32", 
                span: #0 bytes(191..194)
            }
        ], 
        span: #0 bytes(158..196)
    }
]

TokenStream
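To make the Ident / Punct / Group shape of the dump above more concrete, here is a minimal sketch of a toy tokenizer that builds a similar tree using only the standard library. The `Token`, `Delimiter`, `parse_stream` and `tokenize` names are made up for this illustration. They only imitate the structure printed above and are not the proc_macro or proc_macro2 API, and spans and spacing are left out entirely.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Toy stand-ins that imitate the shape of the TokenStream dump.
// Assumption: these are illustration-only types, not the real API.
#[derive(Debug, PartialEq)]
enum Delimiter {
    Brace,
    Parenthesis,
    Bracket,
}

#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    Punct(char),
    Group { delimiter: Delimiter, stream: Vec<Token> },
}

// Reads tokens until the expected closing character is found,
// or until the input ends when `close` is None.
fn parse_stream(chars: &mut Peekable<Chars>, close: Option<char>) -> Vec<Token> {
    let mut tokens = Vec::new();
    while let Some(c) = chars.next() {
        if Some(c) == close {
            return tokens;
        }
        // An opening delimiter starts a nested stream.
        let opened = match c {
            '{' => Some((Delimiter::Brace, '}')),
            '(' => Some((Delimiter::Parenthesis, ')')),
            '[' => Some((Delimiter::Bracket, ']')),
            _ => None,
        };
        if let Some((delimiter, closing)) = opened {
            let stream = parse_stream(chars, Some(closing));
            tokens.push(Token::Group { delimiter, stream });
        } else if c.is_alphanumeric() || c == '_' {
            // Collect the rest of the identifier.
            let mut ident = String::from(c);
            while let Some(&next) = chars.peek() {
                if next.is_alphanumeric() || next == '_' {
                    ident.push(next);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(ident));
        } else if !c.is_whitespace() {
            // Everything else is treated as single-character punctuation.
            tokens.push(Token::Punct(c));
        }
    }
    tokens
}

fn tokenize(src: &str) -> Vec<Token> {
    parse_stream(&mut src.chars().peekable(), None)
}

fn main() {
    let tokens = tokenize("struct Vector3 { x: f32, y: f32, z: f32 }");
    println!("{:#?}", tokens);
}
```

The keyword struct comes out as a plain Ident here, just as it does in the dump, and everything between the braces ends up inside a single Group. That nesting is why a macro can skip over a whole struct body or tuple as one unit.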

DeriveInput { 
    attrs: [], 
    vis: Inherited, 
    ident: Ident { 
        ident: "Vector3", 
        span: #0 bytes(150..157)
    }, 
    generics: Generics { 
        lt_token: None, 
        params: [], 
        gt_token: None, 
        where_clause: None
    }, 
    data: Struct(DataStruct { 
        struct_token: Struct, 
        fields: Named(FieldsNamed { 
            brace_token: Brace, 
            named: [
                Field { 
                    attrs: [], 
                    vis: Inherited, 
                    ident: Some(Ident { 
                        ident: "x", 
                        span: #0 bytes(164..165)
                    }), 
                    colon_token: Some(Colon), 
                    ty: Path(TypePath { 
                        qself: None, 
                        path: Path { 
                            leading_colon: None, 
                            segments: [
                                PathSegment { 
                                    ident: Ident {
                                        ident: "f32", 
                                        span: #0 bytes(167..170)
                                    }, 
                                    arguments: None
                                }
                            ]
                        }
                    })
                }, 
                Comma, 
                Field { 
                    attrs: [], 
                    vis: Inherited, 
                    ident: Some(Ident { 
                        ident: "y", 
                        span: #0 bytes(176..177)
                    }), 
                    colon_token: Some(Colon), 
                    ty: Path(TypePath { 
                        qself: None, 
                        path: Path { 
                            leading_colon: None, 
                            segments: [
                                PathSegment { 
                                    ident: Ident { 
                                        ident: "f32", 
                                        span: #0 bytes(179..182)
                                    }, 
                                    arguments: None
                                }
                            ]
                        }
                    })
                }, 
                Comma, 
                Field { 
                    attrs: [], 
                    vis: Inherited, 
                    ident: Some(Ident { 
                        ident: "z", 
                        span: #0 bytes(188..189)
                    }), 
                    colon_token: Some(Colon), 
                    ty: Path(TypePath { 
                        qself: None, 
                        path: Path { 
                            leading_colon: None, 
                            segments: [
                                PathSegment { 
                                    ident: Ident { 
                                        ident: "f32", 
                                        span: #0 bytes(191..194)
                                    }, 
                                    arguments: None
                                }
                            ]
                        }
                    })
                }
            ]
        }), 
        semi_token: None
    })
}

DeriveInput

Enum

enum MyEnum {
    VariantA,
    VariantB(i32),
    VariantC{x: i32, y: i32, z:i32},
    VariantD(f32, f32)
}

This one expands quite a bit, but the key piece of information here is data.variants, which provides access to each of the enum's variants.

TokenStream [
    Ident { 
        ident: "enum", 
        span: #0 bytes(875..879)
    },
    Ident {
        ident: "MyEnum",
        span: #0 bytes(880..886)
    },
    Group {
        delimiter: Brace,
        stream: TokenStream [
            Ident {
                ident: "VariantA",
                span: #0 bytes(893..901)
            },
            Punct {
                ch: ',',
                spacing: Alone,
                span: #0 bytes(901..902)
            },
            Ident {
                ident: "VariantB",
                span: #0 bytes(907..915)
            },
            Group {
                delimiter: Parenthesis,
                stream: TokenStream [
                    Ident {
                        ident: "i32",
                        span: #0 bytes(916..919)
                    }
                ],
                span: #0 bytes(915..920)
            },
            Punct {
                ch: ',',
                spacing: Alone,
                span: #0 bytes(920..921)
            },
            Ident {
                ident: "VariantC",
                span: #0 bytes(926..934)
            },
            Group {
                delimiter: Brace,
                stream: TokenStream [
                    Ident {
                        ident: "x",
                        span: #0 bytes(935..936)
                    },
                    Punct {
                        ch: ':',
                        spacing: Alone,
                        span: #0 bytes(936..937)
                    },
                    Ident {
                        ident: "i32",
                        span: #0 bytes(938..941)
                    },
                    Punct {
                        ch: ',',
                        spacing: Alone,
                        span: #0 bytes(941..942)
                    },
                    Ident {
                        ident: "y",
                        span: #0 bytes(943..944)
                    },
                    Punct {
                        ch: ':',
                        spacing: Alone,
                        span: #0 bytes(944..945)
                    },
                    Ident {
                        ident: "i32",
                        span: #0 bytes(946..949)
                    },
                    Punct {
                        ch: ',',
                        spacing: Alone,
                        span: #0 bytes(949..950)
                    },
                    Ident {
                        ident: "z",
                        span: #0 bytes(951..952)
                    },
                    Punct {
                        ch: ':',
                        spacing: Alone,
                        span: #0 bytes(952..953)
                    },
                    Ident {
                        ident: "i32",
                        span: #0 bytes(953..956)
                    }
                ],
                span: #0 bytes(934..957)
            },
            Punct {
                ch: ',',
                spacing: Alone,
                span: #0 bytes(957..958)
            },
            Ident {
                ident: "VariantD",
                span: #0 bytes(963..971)
            },
            Group {
                delimiter: Parenthesis,
                stream: TokenStream [
                    Ident {
                        ident: "f32",
                        span: #0 bytes(972..975)
                    },
                    Punct {
                        ch: ',',
                        spacing: Alone,
                        span: #0 bytes(975..976)
                    },
                    Ident {
                        ident: "f32",
                        span: #0 bytes(977..980)
                    }
                ],
                span: #0 bytes(971..981)
            }
        ],
        span: #0 bytes(887..983)
    }
]

TokenStream

DeriveInput { 
    attrs: [], 
    vis: Inherited, 
    ident: Ident { 
        ident: "MyEnum", 
        span: #0 bytes(880..886)
    }, 
    generics: Generics { 
        lt_token: None, 
        params: [], 
        gt_token: None, 
        where_clause: None
    }, 
    data: Enum(DataEnum { 
        enum_token: Enum, 
        brace_token: Brace, 
        variants: [
            Variant { 
                attrs: [], 
                ident: Ident { 
                    ident: "VariantA", 
                    span: #0 bytes(893..901)
                }, 
                fields: Unit, 
                discriminant: None
            }, 
            Comma, 
            Variant { 
                attrs: [], 
                ident: Ident { 
                    ident: "VariantB",
                    span: #0 bytes(907..915)
                }, 
                fields: Unnamed(FieldsUnnamed { 
                    paren_token: Paren, 
                    unnamed: [
                        Field { 
                            attrs: [], 
                            vis: Inherited, 
                            ident: None, 
                            colon_token: None, 
                            ty: Path(TypePath { 
                                qself: None, 
                                path: Path { 
                                    leading_colon: None, 
                                    segments: [
                                        PathSegment { 
                                            ident: Ident { 
                                                ident: "i32", 
                                                span: #0 bytes(916..919)
                                            }, 
                                            arguments: None
                                        }
                                    ]
                                }
                            })
                        }
                    ]
                }), 
                discriminant: None
            }, 
            Comma, 
            Variant { 
                attrs: [], 
                ident: Ident { 
                    ident: "VariantC", 
                    span: #0 bytes(926..934)
                }, 
                fields: Named(FieldsNamed { 
                    brace_token: Brace, 
                    named: [
                        Field { 
                            attrs: [], 
                            vis: Inherited, 
                            ident: Some(Ident { 
                                ident: "x", 
                                span: #0 bytes(935..936)
                            }), 
                            colon_token: Some(Colon), 
                            ty: Path(TypePath { 
                                qself: None, 
                                path: Path { 
                                    leading_colon: None, 
                                    segments: [
                                        PathSegment { 
                                            ident: Ident { 
                                                ident: "i32", 
                                                span: #0 bytes(938..941)
                                            }, 
                                            arguments: None
                                        }
                                    ]
                                }
                            })
                        }, 
                        Comma, 
                        Field { 
                            attrs: [], 
                            vis: Inherited, 
                            ident: Some(Ident { 
                                ident: "y", 
                                span: #0 bytes(943..944)
                            }), 
                            colon_token: Some(Colon), 
                            ty: Path(TypePath { 
                                qself: None, 
                                path: Path { 
                                    leading_colon: None, 
                                    segments: [
                                        PathSegment { 
                                            ident: Ident { 
                                                ident: "i32", 
                                                span: #0 bytes(946..949)
                                            }, 
                                            arguments: None
                                        }
                                    ]
                                }
                            })
                        }, 
                        Comma, 
                        Field { 
                            attrs: [], 
                            vis: Inherited, 
                            ident: Some(Ident { 
                                ident: "z", 
                                span: #0 bytes(951..952)
                            }), 
                            colon_token: Some(Colon), 
                            ty: Path(TypePath { 
                                qself: None, 
                                path: Path { 
                                    leading_colon: None, 
                                    segments: [
                                        PathSegment { 
                                            ident: Ident { 
                                                ident: "i32", 
                                                span: #0 bytes(953..956)
                                            }, 
                                            arguments: None
                                        }
                                    ]
                                }
                            })
                        }
                    ]
                }), 
                discriminant: None
            }, 
            Comma, 
            Variant { 
                attrs: [], 
                ident: Ident { 
                    ident: "VariantD", 
                    span: #0 bytes(963..971)
                }, 
                fields: Unnamed(FieldsUnnamed { 
                    paren_token: Paren, 
                    unnamed: [
                        Field { 
                            attrs: [], 
                            vis: Inherited, 
                            ident: None, 
                            colon_token: None, 
                            ty: Path(TypePath { 
                                qself: None, 
                                path: Path { 
                                    leading_colon: None, 
                                    segments: [
                                        PathSegment { 
                                            ident: Ident { 
                                                ident: "f32", 
                                                span: #0 bytes(972..975)
                                            }, 
                                            arguments: None
                                        }
                                    ]
                                }
                            })
                        }, 
                        Comma, 
                        Field { 
                            attrs: [], 
                            vis: Inherited, 
                            ident: None, 
                            colon_token: None, 
                            ty: Path(TypePath { 
                                qself: None, 
                                path: Path { 
                                    leading_colon: None, 
                                    segments: [
                                        PathSegment { 
                                            ident: Ident { 
                                                ident: "f32", 
                                                span: #0 bytes(977..980)
                                            }, 
                                            arguments: None
                                        }
                                    ]
                                }
                            })
                        }
                    ]
                }), 
                discriminant: None
            }
        ]
    })
}

DeriveInput
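In the dump above, every variant carries a fields value that is Unit, Unnamed, or Named, and a macro usually branches on exactly that. The sketch below mirrors the idea with simplified stand-in types so that it runs without any dependencies. The `Fields` and `Variant` types here are assumptions made for illustration (field types are plain strings, and `describe` and `my_enum_variants` are hypothetical helpers). In a real macro you would match on syn::Data::Enum and iterate over its variants instead.

```rust
// Simplified stand-ins for the Variant / Fields shapes in the dump.
// Assumption: types are stored as plain strings, unlike syn's Type.
#[derive(Debug)]
enum Fields {
    Unit,
    Unnamed(Vec<String>),
    Named(Vec<(String, String)>),
}

#[derive(Debug)]
struct Variant {
    ident: String,
    fields: Fields,
}

// Produces a one-line summary of a variant based on its field kind.
fn describe(variant: &Variant) -> String {
    match &variant.fields {
        Fields::Unit => format!("{}: unit variant", variant.ident),
        Fields::Unnamed(types) => {
            format!("{}: tuple variant ({})", variant.ident, types.join(", "))
        }
        Fields::Named(fields) => {
            let parts: Vec<String> = fields
                .iter()
                .map(|(name, ty)| format!("{}: {}", name, ty))
                .collect();
            format!("{}: struct variant {{ {} }}", variant.ident, parts.join(", "))
        }
    }
}

// Hand-built equivalent of MyEnum's variants, in declaration order.
fn my_enum_variants() -> Vec<Variant> {
    let named = |name: &str, ty: &str| (name.to_string(), ty.to_string());
    vec![
        Variant { ident: "VariantA".to_string(), fields: Fields::Unit },
        Variant {
            ident: "VariantB".to_string(),
            fields: Fields::Unnamed(vec!["i32".to_string()]),
        },
        Variant {
            ident: "VariantC".to_string(),
            fields: Fields::Named(vec![
                named("x", "i32"),
                named("y", "i32"),
                named("z", "i32"),
            ]),
        },
        Variant {
            ident: "VariantD".to_string(),
            fields: Fields::Unnamed(vec!["f32".to_string(), "f32".to_string()]),
        },
    ]
}

fn main() {
    for variant in my_enum_variants() {
        println!("{}", describe(&variant));
    }
}
```

Note that unnamed fields have no identifier, which matches the ident: None entries for VariantB and VariantD in the dump.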