Throughout the documentation, you will see many references to Images. We'll go over what Images are, how they are used, and the various options associated with them here.
First we need to provide a short overview of how plugins work.
When you invoke protoc with the --go_out flag on a file such as foo.proto, the following is (roughly) what happens:
- protoc compiles the file foo.proto (and any imports) and internally produces a FileDescriptorSet, which is just a list of FileDescriptorProto messages. These messages contain all information about your .proto files, optionally including source code information such as the start/end line/column of each element of your .proto file, as well as associated comments.
- The FileDescriptorSet is turned into a CodeGeneratorRequest, which contains the FileDescriptorProtos for foo.proto and any imports, a list of the files specified (just foo.proto in this example), as well as any options passed to the plugin.
- protoc then looks for a binary named protoc-gen-go, and invokes it, giving the serialized CodeGeneratorRequest as stdin.
- protoc-gen-go runs, and either errors or produces a CodeGeneratorResponse, which specifies what files are to be generated and their content. The serialized CodeGeneratorResponse is written to stdout of protoc-gen-go.
- On success of protoc-gen-go, protoc reads stdout and then writes these generated files.
The builtin generators of protoc, such as --cpp_out, work in roughly
the same manner, although instead of executing an external binary, the work is done internally
by protoc.
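The request/response exchange described above can be sketched in a few lines of Python. This is a toy model under loud assumptions: JSON stands in for Protobuf's binary encoding, a plain function call stands in for the stdin/stdout pipe between processes, and toy_compile, toy_plugin, and toy_protoc are hypothetical names. The field names file_to_generate, parameter, proto_file, and the response's file entries with name and content follow the CodeGeneratorRequest and CodeGeneratorResponse messages.

```python
import json


def toy_compile(paths):
    """Stand-in for the compile step: one descriptor-like dict per file."""
    return [{"name": path} for path in paths]


def toy_plugin(request_bytes):
    """Stand-in for protoc-gen-go: serialized request in, serialized response out."""
    request = json.loads(request_bytes)
    generated = [
        {
            "name": name.replace(".proto", ".pb.go"),
            "content": "// generated from " + name + "\n",
        }
        for name in request["file_to_generate"]
    ]
    return json.dumps({"file": generated}).encode()


def toy_protoc(paths, parameter, plugin):
    """Builds the request, hands it to the plugin, and returns the files to write."""
    request = {
        "file_to_generate": paths,           # files named on the command line
        "parameter": parameter,              # options passed to the plugin
        "proto_file": toy_compile(paths),    # compiled descriptors
    }
    response = json.loads(plugin(json.dumps(request).encode()))
    return {f["name"]: f["content"] for f in response["file"]}


outputs = toy_protoc(["foo.proto"], "", toy_plugin)
print(sorted(outputs))
```

The point of the sketch is the shape of the contract: the plugin never reads .proto files itself, it only sees the descriptors that the compiler hands it.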
FileDescriptorSets are the primitive used throughout the Protobuf ecosystem to represent a compiled Protobuf schema. They are also the primary artifact that protoc produces.
In other words, everything you do with
protoc, and any plugin you use, works in terms of FileDescriptorSets. Of note, they are also how gRPC
Reflection works under the hood.
protoc provides the
--descriptor_set_out flag, aliased as
-o, to allow writing serialized
FileDescriptorSets. For example, given a single file
foo.proto, you can write a FileDescriptorSet to
stdout as follows:
The resulting FileDescriptorSet will contain a single FileDescriptorProto with the name foo.proto.
By default, FileDescriptorSets will not include any imports not specified on the command line,
and will not include source code information. Source code information is useful for generating
documentation inside your generated stubs, and for things like linters and breaking change
detectors. As an example, assume foo.proto imports
bar.proto. To produce a FileDescriptorSet
that includes both foo.proto and
bar.proto, as well as source code information:
An Image is Buf's custom extension to FileDescriptorSets. As of this writing, the actual definition is stored in bufbuild/buf.
Images are FileDescriptorSets, and FileDescriptorSets are Images. Due to the forwards and backwards compatible nature of Protobuf, we're able to add an additional field to FileDescriptorSet while maintaining compatibility in both directions: existing Protobuf plugins will just drop this field, and Buf does not require this field to be set to work with Images.
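The two directions of compatibility can be illustrated with a small Python sketch. It is only a model: plain dicts stand in for Protobuf messages, the key "extension" is a hypothetical placeholder for the additional field, and both reader functions are invented for illustration.

```python
# Fields that a consumer unaware of Buf knows about.
KNOWN_FILE_DESCRIPTOR_SET_FIELDS = {"file"}


def read_as_file_descriptor_set(message):
    """Models an existing consumer: fields it does not know are dropped."""
    return {k: v for k, v in message.items() if k in KNOWN_FILE_DESCRIPTOR_SET_FIELDS}


def read_as_image(message):
    """Models Buf: the extension is optional, so its absence is not an error."""
    return {"file": message.get("file", []), "extension": message.get("extension")}


image = {"file": [{"name": "foo.proto"}], "extension": {"note": "hypothetical payload"}}
descriptor_set = {"file": [{"name": "foo.proto"}]}

# An Image read by an existing consumer looks like a plain FileDescriptorSet.
print(read_as_file_descriptor_set(image) == descriptor_set)
# A plain FileDescriptorSet read as an Image simply has no extension.
print(read_as_image(descriptor_set)["extension"] is None)
```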
Images are the primitive of Buf. As a result, FileDescriptorSets are also the primitive of Buf.
Linting and breaking change detection internally operate on Images that Buf either produces on the fly or reads from an external location. Images are a stable, widely used way to represent a compiled Protobuf schema. For the breaking change detector, Images are the storage format to use if you want to manually store the state of your Protobuf schema. See the breaking change documentation for more details.
We use the ImageExtension of an Image to store additional information that Buf needs to perform its operations. Currently, the only additional information stored is the indexes, within the file array, of the FileDescriptorProtos that are imports.
Right now, the only possible imports are the Well-Known Types. All other files are
specified through your build configuration, but it is always possible to include
the Well-Known Types in your
.proto files with Buf, and is usually possible to
include the Well-Known Types with
protoc in a standard installation. It's widely
accepted that a Protobuf compiler should always provide these.
Currently, we use this information in the linter and breaking change detector. For the linter, we do not want to lint imports, since they are not the part of your Protobuf schema that you care about for linting. The linter filters out any imports before running the lint rules. If the ImageExtension field is not present, Buf cannot deduce which FileDescriptorProtos are imports, and lints everything.
For the breaking change detector, we check imports by default; however, you can
exclude imports with the
--exclude-imports flag. As with the linter, if the
ImageExtension field is not present, Buf does not know what an import is, so
--exclude-imports is a no-op.
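The import handling described for the linter and the breaking change detector can be sketched as follows. This is a simplified Python model, not Buf's implementation: files are represented by their names, import_indexes is a hypothetical flattened view of the indexes stored in the ImageExtension, and None models the case where the field is absent.

```python
def non_import_files(files, import_indexes):
    """Returns the files that are not imports.

    import_indexes is None when the ImageExtension field is absent, in which
    case nothing can be identified as an import and every file is returned.
    """
    if import_indexes is None:
        return list(files)
    imports = set(import_indexes)
    return [f for i, f in enumerate(files) if i not in imports]


def files_to_lint(files, import_indexes):
    # The linter always filters imports when it can identify them.
    return non_import_files(files, import_indexes)


def files_to_check_for_breaking(files, import_indexes, exclude_imports=False):
    # Imports are checked by default; exclude_imports models --exclude-imports.
    if exclude_imports:
        return non_import_files(files, import_indexes)
    return list(files)


files = ["google/protobuf/timestamp.proto", "foo.proto"]
print(files_to_lint(files, [0]))
print(files_to_lint(files, None))
print(files_to_check_for_breaking(files, None, exclude_imports=True))
```

With the extension present, only foo.proto is linted. Without it, both the linter and the exclude_imports option fall back to treating every file as part of the schema.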
Images are created using
buf build. Given that you are in the root
of your repository, and you have a proper configuration:
The resulting Image is written to the file
image.bin. Of note, the ordering of
the FileDescriptorProtos is carefully written to mimic the ordering that
protoc would produce, for both the cases where imports are and are not written.
By default, Buf produces an Image with both imports and source code info. You can strip each of these:
In general, we do not recommend stripping these, as this information can be useful for various operations. However, source code info specifically takes a lot of additional space, generally in the region of 5x as much space, so if you know you do not need this data, it can be useful to strip source code info.
Images can be output in one of two formats, binary or JSON.
Either format can be compressed using Gzip or Zstandard.
Per the Inputs documentation,
buf build can deduce the format
by the file extension:
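As a rough illustration of extension-based deduction, the Python sketch below maps a file name to a format and an optional compression. The .bin extension comes from the image.bin example above; the .json, .gz, and .zst suffixes, the returned labels, and the function name deduce_format are assumptions made for this sketch, so consult the Inputs documentation for the authoritative list.

```python
def deduce_format(path):
    """Returns (format, compression) guessed from the file name."""
    compression = None
    # Peel off an optional compression suffix first.
    for suffix, name in ((".gz", "gzip"), (".zst", "zstd")):
        if path.endswith(suffix):
            compression = name
            path = path[: -len(suffix)]
            break
    # What remains decides the serialization format.
    if path.endswith(".json"):
        return "json", compression
    if path.endswith(".bin"):
        return "bin", compression
    raise ValueError("cannot deduce format from " + repr(path))


for candidate in ("image.bin", "image.json", "image.bin.gz", "image.json.zst"):
    print(candidate, deduce_format(candidate))
```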
The special value
- is used to denote stdout. You can manually set the format. For example:
When combined with jq, this also allows for introspection. For example, to see a list of all packages:
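The same kind of introspection can be done in any language that reads JSON. Here is a minimal Python sketch that lists the unique packages in a JSON Image; it assumes the JSON is already available as a string, relies only on the file array and the package field of each FileDescriptorProto, and uses a hand-written example document rather than real Buf output.

```python
import json


def list_packages(image_json):
    """Returns the sorted, de-duplicated package names in a JSON Image."""
    image = json.loads(image_json)
    return sorted({f["package"] for f in image.get("file", []) if f.get("package")})


# Hand-written stand-in for a JSON Image with three files in two packages.
example = json.dumps({
    "file": [
        {"name": "foo.proto", "package": "foo.v1"},
        {"name": "bar.proto", "package": "bar.v1"},
        {"name": "baz.proto", "package": "foo.v1"},
    ]
})
print(list_packages(example))
```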
Images always include the ImageExtension field. If you want a pure FileDescriptorSet
without this field set, to mimic protoc output, Buf can omit it.
The ImageExtension field will not affect Protobuf plugins or any other operations; they will merely see it as an unknown field. We provide the option nonetheless in case you want it.
Since Buf's primitive is the Image, and FileDescriptorSets are Images, we're able to easily
use protoc output as
buf input. As an example for lint:
We discuss this further in the relevant sections of our documentation.