API Implementation
Service Client
Service Methods
C doesn’t have object oriented programming built in, but most programs end up implementing some kind of ad-hoc object model.
Let’s consider the object used to represent a widget in the fictional Azure Widgets service. Such a structure could be defined like this:
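A minimal sketch of such a structure is shown here. The field names and the az_widgets_color enumeration are illustrative assumptions; the guideline does not prescribe them.

```c
#include <stdint.h>

/* Hypothetical color type, invented for this sketch. */
typedef enum az_widgets_color
{
  AZ_WIDGETS_COLOR_RED,
  AZ_WIDGETS_COLOR_GREEN,
  AZ_WIDGETS_COLOR_BLUE
} az_widgets_color;

/* A widget as a plain struct: every field is visible to the caller. */
typedef struct az_widgets_widget
{
  az_widgets_color color;
  uint32_t num_sides;
  double width;
} az_widgets_widget;
```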
If you are defining an opaque type then the definition is placed in a .c
file and the header file contains the following:
If you need to expose the type (for stack allocation), but would like to make it clear that some fields are private and should not be modified, place the type in the header file as follows:
⛔️ DO NOT hide the members of a struct that supports stack allocation, except in the above way. This can result in alignment problems and missed optimization opportunities.
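The two header styles described above might look like the following sketch. The type az_widgets_panel, its fields, and the use of a member named _internal to group private fields are assumptions made for illustration.

```c
#include <stdint.h>

/* Option 1 (opaque): the header carries only a forward declaration.
 * The full definition would live in a .c file, so callers can only
 * hold pointers to the type. */
typedef struct az_widgets_widget az_widgets_widget;

/* Option 2 (stack-allocatable): the full definition is in the header.
 * Private fields are grouped under one nested member so it is obvious
 * which parts callers should leave alone. */
typedef struct az_widgets_panel
{
  uint32_t num_sides; /* public field */
  struct
  {
    uint32_t cached_hash; /* private: for library use only */
  } _internal;
} az_widgets_panel;
```

Because the second form exposes the complete layout, the compiler knows its size and alignment and the caller can place it on the stack.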
Initialization and destruction
You must always have an initialization function. This function will take a block of allocated memory of the correct size and alignment and turn it into a valid object instance, setting fields to default values and processing initialization parameters.
✅ DO name initialization functions with the form az_<libname>_init_...
✅ DO ensure that the object is “ready to use” after a call to the init function for the object.
If there is more than one way to initialize an object, you should define multiple initialization functions with different names. For example:
If initialization could fail (for example, during parameter validation), ensure the init function returns an az_result to indicate error conditions.
A possible implementation of these initialization functions would be:
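One way to write two such initialization functions is sketched below. The az_result stand-in, its values, the fixed capacity, and the struct layout are all assumptions made so the example is self-contained; they are not the Azure Core definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Azure Core result type; the values are assumptions. */
typedef int32_t az_result;
enum { AZ_OK = 0, AZ_ERROR_ARG = 1 };

enum { AZ_WIDGETS_MAX_SIDES = 8 }; /* hypothetical fixed capacity */

typedef struct az_widgets_client
{
  uint32_t num_sides;
  double sides[AZ_WIDGETS_MAX_SIDES];
} az_widgets_client;

/* Default initialization: turns caller-provided memory into a valid,
 * ready-to-use object with every field at its default value. */
az_result az_widgets_client_init(az_widgets_client* client)
{
  if (client == NULL)
  {
    return AZ_ERROR_ARG;
  }
  *client = (az_widgets_client){ 0 };
  return AZ_OK;
}

/* Initialization from caller-supplied side lengths. Parameter
 * validation can fail, so the result is reported through az_result. */
az_result az_widgets_client_init_with_sides(
    az_widgets_client* client,
    uint32_t num_sides,
    double const* sides)
{
  if (client == NULL || num_sides > AZ_WIDGETS_MAX_SIDES
      || (num_sides > 0 && sides == NULL))
  {
    return AZ_ERROR_ARG;
  }
  *client = (az_widgets_client){ 0 };
  for (uint32_t i = 0; i < num_sides; ++i)
  {
    client->sides[i] = sides[i];
  }
  client->num_sides = num_sides;
  return AZ_OK;
}
```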
As with allocation, a type can have a destruction function. However, only types that own a resource (such as memory) or require special cleanup (like securely zeroing their memory) need a destruction function.
☑️ YOU SHOULD prefer non-allocating types and methods.
Allocation and deallocation
Your library should not allocate memory if possible. It should be possible to use the client library without any allocations, deferring all allocations to the client program.
Allocation should be separated from initialization, unless there’s an extremely good reason to tie them together. In general we want to let the user allocate their own memory. You only need an allocation function if you intend to hide the size and alignment of the object from the user.
✅ DO name the allocation and deallocation functions as az_<libname>_(de)allocate_<objtype>.
Note that this is the opposite of the pattern for other methods. Allocation functions do not operate on a value of <objtype>. Rather they create and destroy such values.
✅ DO provide an allocation and deallocation function for opaque types.
☑️ YOU SHOULD take a set of allocation callbacks as a parameter to the allocation and deallocation functions for an opaque type. Use the library default allocation functions if the allocation callback parameter is NULL.
⚠️ YOU SHOULD NOT store a pointer to the allocation callbacks inside the memory returned by the allocation function. You may store such a pointer for debugging purposes.
The intent is to allow storing a pointer to the allocation callbacks to ensure the same set of callbacks is used for allocation and deallocation. However, it is not allowed to change the ABI of the returned object to do this. You need to store the callbacks before or after the pointer returned to the caller.
TODO: Rationalize this - do we store or not store?
⛔️ DO NOT return any errors from the deallocation function. It is impossible to write leak-free programs if deallocation and cleanup functions can fail.
For example:
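The following sketch shows an allocation and deallocation pair for an opaque widget type. The az_allocator_callbacks struct is a hypothetical shape for the callback set, and malloc and free stand in for the library default allocation functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical callback bundle; the guideline does not define its shape. */
typedef struct az_allocator_callbacks
{
  void* (*allocate)(size_t size, void* user_data);
  void (*deallocate)(void* ptr, void* user_data);
  void* user_data;
} az_allocator_callbacks;

/* In a real library only this forward declaration is in the header. */
typedef struct az_widgets_widget az_widgets_widget;

/* The definition would be private to the .c file. */
struct az_widgets_widget
{
  uint32_t num_sides;
};

/* Creates storage for a widget; NULL callbacks select the defaults. */
az_widgets_widget* az_widgets_allocate_widget(az_allocator_callbacks const* callbacks)
{
  if (callbacks == NULL)
  {
    return malloc(sizeof(az_widgets_widget));
  }
  return callbacks->allocate(sizeof(az_widgets_widget), callbacks->user_data);
}

/* Returns void: deallocation has no way to report failure. */
void az_widgets_deallocate_widget(
    az_widgets_widget* widget,
    az_allocator_callbacks const* callbacks)
{
  if (callbacks == NULL)
  {
    free(widget);
    return;
  }
  callbacks->deallocate(widget, callbacks->user_data);
}
```

In this sketch the caller is responsible for passing the same callback set to both functions.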
Initialization for objects that allocate
The initialization function should take a set of allocation callbacks and store them inside the object instance.
✅ DO take a set of allocation / deallocation callbacks in the init function of objects owning inner pointers.
⚠️ YOU SHOULD NOT allocate different inner pointers with different sets of allocation callbacks. Use a single allocation callback.
Destruction for objects that allocate
✅ DO name destruction functions az_<libname>_<objtype>_destroy.
⛔️ DO NOT take allocation callbacks in the destruction function.
The reason one would take an allocation callback parameter in the destruction function is to save space by not storing it in the object instance. The reason we prohibit this is that it means an object that owns a pointer to another object must then take two allocation parameters in its destroy function.
⛔️ DO NOT return any errors in the destruction function. It’s impossible to write leak-free programs if deallocation / cleanup functions can fail.
The destruction function should follow this pattern:
The following is a possible implementation of a destruction function for the widgets client object:
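A minimal sketch of an init and destroy pair for a client that owns an inner pointer follows. It takes the approach of storing the callbacks inside the object instance; the az_allocator_callbacks struct, the az_result stand-in, and the client fields are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef int32_t az_result; /* stand-in for the Azure Core type */
enum { AZ_OK = 0, AZ_ERROR_ARG = 1, AZ_ERROR_OUT_OF_MEMORY = 2 };

typedef struct az_allocator_callbacks /* hypothetical */
{
  void* (*allocate)(size_t size, void* user_data);
  void (*deallocate)(void* ptr, void* user_data);
  void* user_data;
} az_allocator_callbacks;

typedef struct az_widgets_client
{
  az_allocator_callbacks allocator; /* stored at init, reused at destroy */
  double* sides;                    /* inner pointer owned by the client */
  uint32_t num_sides;
} az_widgets_client;

az_result az_widgets_client_init(
    az_widgets_client* client,
    az_allocator_callbacks const* allocator,
    uint32_t num_sides)
{
  if (client == NULL || allocator == NULL || num_sides == 0) return AZ_ERROR_ARG;
  if (num_sides > SIZE_MAX / sizeof(double)) return AZ_ERROR_ARG;

  double* sides = allocator->allocate(num_sides * sizeof(double), allocator->user_data);
  if (sides == NULL) return AZ_ERROR_OUT_OF_MEMORY;
  for (uint32_t i = 0; i < num_sides; ++i) sides[i] = 0.0;

  client->allocator = *allocator;
  client->sides = sides;
  client->num_sides = num_sides;
  return AZ_OK;
}

/* Takes no allocator parameter and returns nothing. */
void az_widgets_client_destroy(az_widgets_client* client)
{
  if (client == NULL) return;
  if (client->sides != NULL)
  {
    client->allocator.deallocate(client->sides, client->allocator.user_data);
  }
  client->sides = NULL;
  client->num_sides = 0;
}

/* malloc/free adapters that a caller might supply. */
static void* example_allocate(size_t size, void* user_data)
{
  (void)user_data;
  return malloc(size);
}

static void example_deallocate(void* ptr, void* user_data)
{
  (void)user_data;
  free(ptr);
}
```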
Methods on objects
To define a method on an object simply define a function taking a pointer to that object as its first parameter. For example:
✅ DO use @memberof to indicate a function is associated with a class.
✅ DO provide the class object as the first parameter to a function associated with the class.
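As an illustration of both rules, a method on the widgets client could be sketched as follows. The function name and the num_sides field are hypothetical.

```c
#include <stdint.h>

typedef struct az_widgets_client
{
  uint32_t num_sides;
} az_widgets_client;

/**
 * @brief Returns the number of sides currently held by the client.
 * @memberof az_widgets_client
 *
 * @param client The client to query. The object the method belongs to
 *               is always the first parameter.
 */
uint32_t az_widgets_client_get_num_sides(az_widgets_client const* client)
{
  return client->num_sides;
}
```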
Functions with many parameters
Sometimes a function will take a large number of parameters, many of which have sane defaults. In this case you should pass them via a struct. Default arguments should be represented by “zero”. If the function is a method then the first parameter should still be a pointer to the object type the method is associated with.
For example the previous az_widgets_client_init_with_sides function could be defined instead as:
and would be called from user code like:
Note that the num_sides and sides parameters are left as default.
If the params parameter is NULL then the az_widgets_client_init_with_sides function should assume the defaults for all parameters.
If a function takes both optional and non-optional parameters then prefer passing the non-optional ones as parameters and the optional ones by struct.
✅ DO use a struct to encapsulate parameters if the number of parameters is greater than 5.
⛔️ DO NOT include the class object in the encapsulating parameter struct.
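The parameter-struct pattern described in this section can be sketched as below. The struct name az_widgets_client_init_params, the color field, and the az_result stand-in are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t az_result; /* stand-in for the Azure Core type */
enum { AZ_OK = 0, AZ_ERROR_ARG = 1 };

typedef struct az_widgets_client
{
  uint32_t num_sides;
  double const* sides;
  uint32_t color;
} az_widgets_client;

/* Optional parameters. A zeroed struct means "all defaults".
 * The client itself is deliberately not a member of this struct. */
typedef struct az_widgets_client_init_params
{
  uint32_t num_sides;
  double const* sides;
  uint32_t color; /* hypothetical extra option */
} az_widgets_client_init_params;

az_result az_widgets_client_init_with_sides(
    az_widgets_client* client,
    az_widgets_client_init_params const* params)
{
  static az_widgets_client_init_params const defaults = { 0 };

  if (client == NULL) return AZ_ERROR_ARG;
  if (params == NULL) params = &defaults; /* NULL selects every default */
  if (params->num_sides > 0 && params->sides == NULL) return AZ_ERROR_ARG;

  client->num_sides = params->num_sides;
  client->sides = params->sides;
  client->color = params->color;
  return AZ_OK;
}
```

User code can then set only the fields it cares about with a designated initializer, for example { .color = 3 }, which leaves num_sides and sides at their zero defaults.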
Methods requiring allocation
If a method could require allocating memory then it should use the most relevant set of allocation callbacks. For example the az_widgets_client_add_side method may need to allocate or re-allocate the array of sides. It should use the az_widgets_client allocators. On the other hand the method:
would likely use the allocation callbacks inside the client->str structure.
TODO: Rationalize advice here. Earlier, we said don’t include allocators inside the structure.
Callbacks
Callback functions should be defined to take a pointer to the “sender” object as the first argument and a void pointer to user data as the last argument. Any additional arguments should be contextual data needed by the callback. For example, say we had an object az_widgets_client that could make requests to be handled in a callback and we represent the response as an object az_response. We might define the following:
Client code would use this in the following manner:
✅ DO include a user_data parameter on all callbacks.
⛔️ DO NOT de-reference the user data pointer from inside the library.
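A sketch of the callback shape and its use from client code follows. The simplified az_response, the request function (which completes synchronously here to keep the example short), and the example_ names are all assumptions.

```c
#include <stddef.h>

typedef struct az_widgets_client { int unused; } az_widgets_client;
typedef struct az_response { int status_code; } az_response; /* simplified stand-in */

/* Sender first, contextual data in the middle, user_data last. */
typedef void (*az_widgets_response_callback)(
    az_widgets_client* client,
    az_response const* response,
    void* user_data);

/* Hypothetical request function. */
void az_widgets_client_send_request(
    az_widgets_client* client,
    az_widgets_response_callback callback,
    void* user_data)
{
  az_response response = { 200 };
  /* The library forwards user_data untouched and never dereferences it. */
  callback(client, &response, user_data);
}

/* Client code: user_data carries caller state into the callback. */
typedef struct example_state { int last_status; } example_state;

static void example_on_response(
    az_widgets_client* client,
    az_response const* response,
    void* user_data)
{
  (void)client;
  example_state* state = user_data;
  state->last_status = response->status_code;
}
```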
Discriminated Unions
Discriminated unions can be useful for grouping information in a struct. However, C does not provide a standard way of defining discriminated unions. Use the following:
This syntax is supported on all C99 compilers as it adheres to strict C99 syntax. Access the inner member using union_value.value.the_int (as an example).
The nested enum and union should never have a tag name as this is always an extension. It is a user error to access the union without checking its tag first.
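One possible layout matching this description is sketched below. The type name az_widgets_value, the enumerator names, and the helper function are hypothetical; the_int is the member name used in the text.

```c
typedef struct az_widgets_value
{
  /* Discriminator: the nested enum carries no tag name. */
  enum
  {
    AZ_WIDGETS_VALUE_INT,
    AZ_WIDGETS_VALUE_DOUBLE
  } tag;
  /* Payload: the nested union carries no tag name either. */
  union
  {
    int the_int;
    double the_double;
  } value;
} az_widgets_value;

/* Always checks the tag before reading from the union. */
double az_widgets_value_as_double(az_widgets_value const* v)
{
  switch (v->tag)
  {
    case AZ_WIDGETS_VALUE_INT:
      return (double)v->value.the_int;
    case AZ_WIDGETS_VALUE_DOUBLE:
      return v->value.the_double;
  }
  return 0.0; /* unreachable for valid tags */
}
```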
Service Method Parameters
Parameter Validation
TODO: This is duplicated from the design.md.
The service client will have several methods that perform requests on the service. Service parameters are directly passed across the wire to an Azure service. Client parameters are not passed directly to the service, but used within the client library to fulfill the request. Examples of client parameters include values that are used to construct a URI, or a file that needs to be uploaded to storage.
✅ DO validate client parameters.
⛔️ DO NOT validate service parameters. This includes null checks, empty strings, and other common validating conditions. Let the service validate any request parameters.
✅ DO validate the developer experience when the service parameters are invalid to ensure appropriate error messages are generated by the service. If the developer experience is compromised due to service-side error messages, work with the service team to correct prior to release.
Supporting Types
Serialization
Requests to the service fall into two basic groups - methods that make a single logical request, or a deterministic sequence of requests. An example of a single logical request is a request that may be retried inside the operation. An example of a deterministic sequence of requests is a paged operation.
The logical entity is a protocol neutral representation of a response. For HTTP, the logical entity may combine data from headers, body and the status line. A common example is exposing an ETag header as a property on the logical entity in addition to any deserialized content from the body.
✅ DO optimize for returning the logical entity for a given request. The logical entity MUST represent the information needed in the 99%+ case.
✅ DO make it possible for a developer to get access to the complete response, including the status line, headers and body. The client library MUST follow the language specific guidance for accomplishing this.
For example, you may choose to do something similar to the following:
✅ DO document and provide examples on how to access the raw and streamed response for a given request, where exposed by the client library. We do not expect all methods to expose a streamed response.
For methods that combine multiple requests into a single call:
⛔️ DO NOT return headers and other per-request metadata unless it is obvious which specific HTTP request the method’s return value corresponds to.
✅ DO provide enough information in failure cases for an application to take appropriate corrective action.
Enumeration-like Structures
TODO: This section is not applicable for the Embedded C SDK.
SDK Feature Implementation
Configuration
TODO: Add discussion on configuration environment variables to parallel that of other languages
Logging
Client libraries must support robust logging mechanisms so that the consumer can adequately diagnose issues and quickly determine whether the issue is in the consumer code, client library code, or service.
In general, our advice to consumers of these libraries is to establish logging in their preferred manner at the WARNING level or above in production to capture problems with the application, and this level should be enough for customer support situations. Informational or verbose logging can be enabled on a case-by-case basis to assist with issue resolution.
✅ DO use the Azure Core library for logging.
TODO: The Azure Core logging library does not exist yet.
✅ DO support pluggable log handlers.
✅ DO make it easy for a consumer to enable logging output to the console. The specific steps required to enable logging to the console must be documented.
✅ DO use one of the following log levels when emitting logs: Verbose (details), Informational (things happened), Warning (might be a problem or not), and Error.
✅ DO use the Error logging level for failures that the application is unlikely to recover from (out of memory, etc.).
✅ DO use the Warning logging level when a function fails to perform its intended task. This generally means that the function will raise an exception. Do not include occurrences of self-healing events (for example, when a request will be automatically retried).
✔️ YOU MAY log the request and response (see below) at the Warning level when a request/response cycle (to the start of the response body) exceeds a service-defined threshold. The threshold should be chosen to minimize false-positives and identify service issues.
✅ DO use the Informational logging level when a function operates normally.
✅ DO use the Verbose logging level for detailed troubleshooting scenarios. This is primarily intended for developers or system administrators to diagnose specific failures.
⛔️ DO NOT log payloads or HTTP header/query parameter values that aren’t on the service provided allow list. For header/query parameters not on the allow list use the value <REDACTED> in place of the real value.
✅ DO log request line and headers as an Informational message. The log should include the following information:
- The HTTP method.
- The URL.
- The query parameters (redacted if not in the allow-list).
- The request headers (redacted if not in the allow-list).
- An SDK provided request ID for correlation purposes.
- The number of times this request has been attempted.
✅ DO log response line and headers as an Informational message. The format of the log should be the following:
- The SDK provided request ID (see above).
- The status code.
- Any message provided with the status code.
- The response headers (redacted if not in the allow-list).
- The time period between the first attempt of the request and the first byte of the body.
✅ DO log an Informational message if a service call is cancelled. The log should include:
- The SDK provided request ID (see above).
- The reason for the cancellation (if available).
✅ DO log exceptions thrown as a Warning level message. If the log level is set to Verbose, append stack trace information to the message.
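Since the Azure Core logging library is not yet available, the following is only a sketch of how the four log levels, a pluggable handler, and allow-list redaction could fit together. Every name in it (az_log_level, az_log_handler, az_log_set_handler, az_log_write, az_log_header_value) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef enum az_log_level /* hypothetical names for the four levels */
{
  AZ_LOG_LEVEL_VERBOSE,
  AZ_LOG_LEVEL_INFORMATIONAL,
  AZ_LOG_LEVEL_WARNING,
  AZ_LOG_LEVEL_ERROR
} az_log_level;

/* Pluggable handler: the application installs one function that
 * receives every message, for example to print it to the console. */
typedef void (*az_log_handler)(az_log_level level, char const* message, void* user_data);

static az_log_handler g_handler = NULL;
static void* g_handler_user_data = NULL;

void az_log_set_handler(az_log_handler handler, void* user_data)
{
  g_handler = handler;
  g_handler_user_data = user_data;
}

void az_log_write(az_log_level level, char const* message)
{
  if (g_handler != NULL)
  {
    g_handler(level, message, g_handler_user_data);
  }
}

/* Returns the text to log for a header or query parameter: the real
 * value when the name is on the allow list, otherwise "<REDACTED>".
 * Names are matched exactly here for brevity; HTTP header names are
 * case-insensitive, so a real implementation would need to fold case. */
char const* az_log_header_value(
    char const* name,
    char const* value,
    char const* const* allow_list,
    size_t allow_list_length)
{
  for (size_t i = 0; i < allow_list_length; ++i)
  {
    if (strcmp(allow_list[i], name) == 0) return value;
  }
  return "<REDACTED>";
}
```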
Distributed Tracing
TODO: Implement the spirit of the general guidelines for distributed tracing.
TODO: Distributed Tracing is explicitly removed?
Telemetry
TODO: Add details about how telemetry is not supported in the Embedded C SDK.
Testing
We believe testing is a part of the development process, so we expect unit and integration tests to be a part of the source code. All components must be covered by automated testing, and developers should strive to test corner cases and main flows for every use case.
All code should contain, at least, requirements, unit tests, end-to-end tests, and samples. The requirements description should be placed in the unit test file, on top of the test function that verifies the requirement. The unit test name should be placed in the code as a comment, together with the code that implements that functionality. For example:
*API source code file:*
*Unit test file:*
If a single unit test tests more than one requirement, it should be sequentially enumerated in the unit test file, and the same number should be added to the test name in the code comment. For example:
*API source code file:*
*Unit test file:*
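The convention described above might look like the following sketch, with the API source and the unit tests shown in one listing. The function under test, the bracketed comment format, and the numeric suffixes are assumptions; plain assert stands in for the unit test framework's macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t az_result; /* stand-in for the Azure Core type */
enum { AZ_OK = 0, AZ_ERROR_ARG = 1 };

typedef struct az_widgets_client { uint32_t num_sides; } az_widgets_client;

/* ---- API source code file ---- */
az_result az_widgets_client_get_num_sides(
    az_widgets_client const* client,
    uint32_t* out_num_sides)
{
  if (client == NULL)
  {
    /* [az_widgets_client_get_num_sides_null_argument_fails_1] */
    return AZ_ERROR_ARG;
  }
  if (out_num_sides == NULL)
  {
    /* [az_widgets_client_get_num_sides_null_argument_fails_2] */
    return AZ_ERROR_ARG;
  }
  /* [az_widgets_client_get_num_sides_succeeds] */
  *out_num_sides = client->num_sides;
  return AZ_OK;
}

/* ---- Unit test file ---- */

/**
 * 1. If client is NULL, az_widgets_client_get_num_sides shall return
 *    AZ_ERROR_ARG.
 * 2. If out_num_sides is NULL, az_widgets_client_get_num_sides shall
 *    return AZ_ERROR_ARG.
 */
void az_widgets_client_get_num_sides_null_argument_fails(void)
{
  az_widgets_client client = { 6 };
  uint32_t n = 0;
  assert(az_widgets_client_get_num_sides(NULL, &n) == AZ_ERROR_ARG);
  assert(az_widgets_client_get_num_sides(&client, NULL) == AZ_ERROR_ARG);
}

/**
 * az_widgets_client_get_num_sides shall store the number of sides in
 * out_num_sides and return AZ_OK.
 */
void az_widgets_client_get_num_sides_succeeds(void)
{
  az_widgets_client client = { 6 };
  uint32_t n = 0;
  assert(az_widgets_client_get_num_sides(&client, &n) == AZ_OK);
  assert(n == 6);
}
```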
⛔️ DO NOT have any memory leaks. Run samples and unit tests with valgrind. Unit tests and e2e tests are valgrind verified at the gate.
✅ DO unit test your API with CppUTest, a unit testing and mocking framework for C and C++.
✅ DO automatically run unit tests when building your client library; i.e. make unit tests part of your continuous integration (CI).
✅ DO maintain a minimum 80% code coverage with unit tests.
Tooling
We use a common build and test pipeline to provide for automatic distribution of client libraries. To support this, we need common tooling.
✅ DO use CMake v3.7 for your project build system.
Version 3.7 is the minimum version installed on the Azure Pipelines Microsoft hosted agents (https://docs.microsoft.com/azure/devops/pipelines/agents/hosted)
✅ DO include the following standard targets:
- build to build the library
- test to run the unit test suite
- docs to generate reference documentation
- all to run all three targets
Include other targets as they appear useful during the development process.
☑️ YOU SHOULD minimize build variants. In particular do not add build options that change the client library ABI or API.
TODO: Should we advise using valgrind, cppcheck, or other analysis tools (static or dynamic)?
✅ DO use hidden visibility when building dynamic libraries. For CMake:
This allows you to use an export macro to export symbols. For example:
CMake will automatically generate an appropriate export header:
✅ DO use clang-format for formatting, with the following command-line options:
Using -i does an in-place edit of the files for style. There is a Visual Studio extension that binds Ctrl-R Ctrl-F to this operation. Visual Studio 2019 includes this functionality by default.
TODO: Decide on exact formatting standards to use and include here.
✅ DO generate API documentation with doxygen.
For example in CMake:
Notice that:
- We use find_package() to find doxygen.
- We use the DOXYGEN_<PREF> CMake variables instead of writing your own doxyfile.
- We set OPTIMIZE_OUTPUT_FOR_C in order to get more C appropriate output.
- We use doxygen_add_docs to add the target; this will generate a doxyfile for you.
✅ DO provide a CMake option of the form <SDK_NAME>_BUILD_SAMPLES that includes all samples in the build. For example:
⛔️ DO NOT install samples by default.
Formatting
✅ DO use clang-format for formatting your code. Use the common clang-format options from Engineering Systems.
In general, clang-format will format your code correctly and ensure consistency. However, there are a few additional rules to keep in mind.
✅ DO place all conditional or loop statements on one line, or add braces to identify the conditional/looping block.
✅ DO add comments to closing braces. Adding a comment to closing braces can help when you are reading code because you don’t have to find the begin brace to know what is going on.
✅ DO add comments to closing preprocessor directives to make them easier to understand. For example:
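A short sketch of commented closing braces and closing preprocessor directives follows. The macro and function names are made up for illustration.

```c
#ifndef EXAMPLE_TRACE_ENABLED /* hypothetical build switch */
#define EXAMPLE_TRACE_ENABLED 0
#endif /* EXAMPLE_TRACE_ENABLED */

/* Counts the even numbers in an array. */
int example_count_even(int const* values, int count)
{
  int even = 0;
  for (int i = 0; i < count; ++i)
  {
    if (values[i] % 2 == 0)
    {
      ++even;
    } /* if value is even */
  } /* for each value */
  return even;
} /* example_count_even */
```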
⚠️ YOU SHOULD NOT use parens in return statements when it isn’t necessary.
✅ DO place constants on the right of comparisons. For example, use if (a == 0) and not if (0 == a).
✅ DO include a comment for falling through a non-empty case statement. For example:
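A fall-through from a non-empty case could be marked as in this sketch. The flag values and the function are hypothetical.

```c
enum { EXAMPLE_FLAG_LOG = 1, EXAMPLE_FLAG_RETRY = 2 };

/* Maps a level to feature flags; level 2 includes everything level 1 has. */
int example_flags_for_level(int level)
{
  int flags = 0;
  switch (level)
  {
    case 2:
      flags |= EXAMPLE_FLAG_RETRY;
      /* Level 2 also gets the level 1 flags. */
      /* fall through */
    case 1:
      flags |= EXAMPLE_FLAG_LOG;
      break;
    default:
      break;
  }
  return flags;
}
```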
⚠️ YOU SHOULD NOT use goto statements. The main place where goto statements can be usefully employed is to break out of several levels of switch, for, or while nesting, although the need to do such a thing may indicate that the inner constructs should be broken out into a separate function with a success/failure return code. When a goto is necessary, the accompanying label should be alone on a line and to the left of the code that follows. The goto should be commented as to its utility and purpose.
Complexity Management
☑️ YOU SHOULD initialize all variables. Only leave them uninitialized if there is a real performance reason to do so. Use static and dynamic analysis tools to check for uninitialized access. You may leave “result” variables uninitialized so long as they clearly do not escape from the innermost lexical scope.
☑️ YOU SHOULD limit function bodies to one page of code (40 lines, approximately).
✅ DO document null statements. Always document a null body for a for or while statement so that it is clear the null body is intentional.
✅ DO use explicit comparisons when testing for failure. Use if (FAIL != f()) rather than if (f()), even though FAIL may have the value 0 which C considers to be false. An explicit test will help you out later when somebody decides that a failure return should be -1 instead of 0.
Explicit comparison should be used even if the comparison value will never change. For example, if (!(bufsize % sizeof(int))) should be written as if (0 == (bufsize % sizeof(int))) to reflect the numeric (not boolean) nature of the test.
A frequent trouble spot is using strcmp to test for string equality. You should never rely on the default truth value of its result, because strcmp returns 0 when the strings are equal. The preferred approach is to use an inline function:
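Such an inline function might be sketched as follows. The name example_str_equal is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Wraps strcmp so call sites read as an equality test. strcmp returns 0
 * for equal strings, so the explicit comparison converts that to true. */
static inline bool example_str_equal(char const* a, char const* b)
{
  return strcmp(a, b) == 0;
}
```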
⚠️ YOU SHOULD NOT use embedded assignments. There is a time and a place for embedded assignment statements. In some constructs, there is no better way to accomplish the results without making the code bulkier and less readable.
However, one should consider the tradeoff between increased speed and decreased maintainability that results when embedded assignments are used in artificial places.
Memory management
✅ DO let the caller allocate memory, then pass a pointer to it to the functions; e.g. int az_iot_create_client(az_iot_client* client);
The developer could then write code similar to:
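A sketch of caller-side usage is shown below. The az_iot_client field and the convention that 0 means success are assumptions, since the text gives only the function signature.

```c
#include <stddef.h>

typedef struct az_iot_client
{
  int is_initialized; /* hypothetical field */
} az_iot_client;

/* Returns 0 on success and non-zero on failure (an assumed convention). */
int az_iot_create_client(az_iot_client* client)
{
  if (client == NULL) return 1;
  client->is_initialized = 1;
  return 0;
}

/* Caller code: the storage lives on the caller's stack, so the library
 * performs no allocation at all. */
int example_use_client(void)
{
  az_iot_client client;
  if (az_iot_create_client(&client) != 0) return 1;
  return client.is_initialized ? 0 : 1;
}
```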
☑️ YOU SHOULD use user-overridable functions if you must allocate memory within the client library.
TODO: Should this be in azure core, or specific to a library?
Secure functions
⚠️ YOU SHOULD NOT use Microsoft security enhanced versions of CRT functions to implement APIs that need to be portable across many platforms. Such code is not portable and is not C99 compatible. Adding that code to your API will complicate the implementation with little to no gain from the security side. See arguments against.
TODO: Verify with the security team, and what are the alternatives?
Miscellaneous
⛔️ DO NOT use implicit assignment inside a test. This is generally an accidental omission of the second = of the logical compare. The following is confusing and prone to error.
Does the programmer really mean assignment here? Sometimes yes, but usually no. Instead use explicit tests and avoid assignment with an implicit test. The recommended form is to do the assignment before doing the test:
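The recommended form, with the assignment on its own line followed by an explicit test, could look like this sketch. The function is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first '=' in text, or -1 if there is none. */
int example_find_equals(char const* text)
{
  char const* position = strchr(text, '='); /* assignment first */
  if (position == NULL)                     /* explicit test second */
  {
    return -1;
  }
  return (int)(position - text);
}
```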
✅ DO use the register keyword sparingly to indicate the variables that you think are most critical. Modern compilers will put variables in registers automatically. In extreme cases, mark the 2-4 most critical values as register and mark the rest as REGISTER. The latter can be #defined to register on those machines with many registers.
✅ DO be const correct. C provides the const keyword to mark parameters that must not change, which indicates when a method doesn’t modify its object. Using const in all the right places is called “const correctness.”
✅ DO use #if instead of #ifdef. For example:
Someone else might compile the code with debug info turned off like:
Always use #if if you have to use the preprocessor. This works fine, and does the right thing, even if DEBUG is not defined at all.
If you really need to test whether a symbol is defined or not, test it with the defined() construct, which allows you to add more things later to the conditional without editing text that’s already in the program:
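The difference can be sketched as follows. If the code is built with -DDEBUG=0, an #ifdef DEBUG block is still compiled in because DEBUG is defined, whereas #if DEBUG tests the value. The EXAMPLE_ macro, the helper functions, and the USER_NAME default are illustrative.

```c
/* #if tests the value: an undefined DEBUG evaluates to 0 here, and so
 * does an explicit -DDEBUG=0. */
#if DEBUG
#define EXAMPLE_DEBUG_ENABLED 1
#else
#define EXAMPLE_DEBUG_ENABLED 0
#endif /* DEBUG */

/* Use defined() when the question really is "does this symbol exist?".
 * More conditions can be added later with && or ||. */
#if !defined(USER_NAME)
#define USER_NAME "john smith"
#endif /* !defined(USER_NAME) */

int example_debug_enabled(void)
{
  return EXAMPLE_DEBUG_ENABLED;
}

char const* example_user_name(void)
{
  return USER_NAME;
}
```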
✅ DO use #if to comment out large code blocks.
Sometimes large blocks of code need to be commented out for testing. The easiest way to do this is with an #if 0 block:
You can’t use /**/ style comments because comments can’t contain comments and a large block of your code will probably contain comments.
Do not use #if 0 directly. Instead, use descriptive macro names:
Always add a short comment explaining why it is commented out.
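A sketch of disabling code with descriptive macros rather than a bare #if 0 follows. The macro names, the function, and the stated reasons are hypothetical.

```c
/* Descriptive switches. Each is 0, so the guarded code is compiled out. */
#define EXAMPLE_TEMP_DISABLED 0
#define EXAMPLE_NOT_YET_IMPLEMENTED 0

int example_compute(int x)
{
  int result = x * 2;

#if EXAMPLE_TEMP_DISABLED /* off while a rounding issue is investigated */
  /* A comment inside the disabled block causes no trouble here. */
  result += 1;
#endif /* EXAMPLE_TEMP_DISABLED */

#if EXAMPLE_NOT_YET_IMPLEMENTED /* waiting on service support */
  result = example_future_transform(result);
#endif /* EXAMPLE_NOT_YET_IMPLEMENTED */

  return result;
}
```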
⛔️ DO NOT put data definitions in header files. For example, this should be avoided:
It’s bad magic to have space consuming code silently inserted through the innocent use of header files. It’s not common practice to define variables in the header file, so it will not occur to developers to look for this when there are problems. Instead, define the variable once in a source file and then use an extern declaration in the header file to reference it.
⛔️ DO NOT use magic numbers. A magic number is a bare naked number used in source code. It’s magic because no-one will know what it means after a while. This significantly reduces maintainability. For example:
Instead of magic numbers, use a real name that means something. You can use #define, constants, or enums as names. For example:
Prefer enum values since the debugger can display the label and value and no memory is allocated. If you use const, memory is allocated. If you use #define, the debugger cannot display the label.
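As a sketch of named constants replacing magic numbers, the enum below gives names to two HTTP status codes. The identifiers and the retry rule are made up for illustration.

```c
/* Named values instead of bare 429 and 503 scattered through the code. */
enum
{
  EXAMPLE_HTTP_STATUS_TOO_MANY_REQUESTS = 429,
  EXAMPLE_HTTP_STATUS_SERVICE_UNAVAILABLE = 503
};

/* Returns 1 when the status code suggests the request can be retried. */
int example_should_retry(int status_code)
{
  return status_code == EXAMPLE_HTTP_STATUS_TOO_MANY_REQUESTS
      || status_code == EXAMPLE_HTTP_STATUS_SERVICE_UNAVAILABLE;
}
```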
✅ DO check every system call for an error return, unless you know you wish to ignore errors. For example, printf returns an error code but it is rarely relevant. Cast the return to (void) if you do not care about the error code.
✅ DO include the system error text when reporting system error messages.
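The following sketch checks each call, includes the system error text in the report, and casts away a return value that is deliberately ignored. The function name is hypothetical, and it assumes a platform (such as POSIX) where fopen sets errno on failure.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Opens and closes a file. Returns 0 on success, -1 on failure after
 * reporting the system error text. */
int example_open_and_close(char const* path)
{
  FILE* file = fopen(path, "r");
  if (file == NULL)
  {
    int saved_errno = errno; /* capture before another call changes it */
    /* The result of fprintf is intentionally ignored. */
    (void)fprintf(stderr, "cannot open %s: %s\n", path, strerror(saved_errno));
    return -1;
  }
  if (fclose(file) != 0)
  {
    int saved_errno = errno;
    (void)fprintf(stderr, "cannot close %s: %s\n", path, strerror(saved_errno));
    return -1;
  }
  return 0;
}
```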
✅ DO check every call to malloc or realloc.
We recommend that you use a library-specific wrapper for memory allocation calls that always do the right thing.
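A library-specific wrapper of this kind might be sketched as follows. The function names and the policies chosen (rejecting zero-size requests, guarding against multiplication overflow, keeping the old block on realloc failure) are assumptions rather than prescribed behavior.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocates an array of count elements; returns NULL on any failure. */
void* example_malloc_array(size_t count, size_t element_size)
{
  if (count == 0 || element_size == 0) return NULL;
  if (count > SIZE_MAX / element_size) return NULL; /* size would overflow */
  return malloc(count * element_size);
}

/* Resizes *block to hold count elements. Returns 0 on success. On
 * failure returns -1 and leaves *block valid and owned by the caller. */
int example_realloc_array(void** block, size_t count, size_t element_size)
{
  if (block == NULL || count == 0 || element_size == 0) return -1;
  if (count > SIZE_MAX / element_size) return -1;

  void* resized = realloc(*block, count * element_size);
  if (resized == NULL) return -1;

  *block = resized;
  return 0;
}
```

Assigning the result of realloc to a temporary first avoids losing the original pointer when the call fails.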