AWS Bedrock with .NET: Cost Tracking via Inference Profiles

Author(s): Ajay Kumar

Last updated: 04 Jul 2026

ℹ️ Part of the Bedrock .NET series

This article extends the earlier Bedrock posts on getting started, production integration, and the Converse API. Here we focus specifically on cost tracking for foundation model traffic using inference profiles.

The Problem: Your AI Bill Grew, But You Cannot Explain It

A familiar story: your team ships two AI features quickly. One is a customer support assistant. The other is a content summarizer for internal dashboards. A few weeks later, finance asks a simple question: "Which feature is driving most of the Bedrock spend?" You open Cost Explorer and see Bedrock charges, but the numbers are blended. You know usage increased, but you cannot confidently split cost by feature, environment, or business owner.

This is where many teams get stuck. They have technical success, but weak cost observability. Without structured attribution, every budgeting conversation becomes guesswork, and every optimization initiative starts late.

ℹ️ Why this matters early

If you add cost tracking only after usage scales, you have historical spend but poor historical context. It is much harder to recover attribution later than to design it from day one.

1. What Are We Doing Here?

We are implementing application-level cost tracking for Amazon Bedrock by routing model traffic through inference profiles and attaching meaningful tags. Inference profiles are not just routing identifiers. They are a practical control point to separate usage across products, tenants, or environments.

Think of it like setting up dedicated electricity meters in a building. The total power bill may be one number, but separate meters for each floor tell you exactly where energy is consumed. Inference profiles play a similar role for model invocation paths.

In this article, our target outcome is:

Create one or more inference profiles per workload segment (for example, support-bot-prod, support-bot-staging).
Tag those profiles with cost allocation dimensions (team, product, environment, cost-center).
Invoke Bedrock using the inference profile identifier from .NET so usage maps cleanly to business context.
Use those tags in billing and reporting workflows.
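
The naming scheme in the first item can be kept consistent with a small helper. The following is a minimal sketch, not a prescribed convention: the function name BuildProfileName and the "content-summarizer" application name are hypothetical, and the name constraints are assumed from the InferenceProfileName parameter of the CloudFormation template shown later in this article.

```csharp
using System;
using System.Text.RegularExpressions;

// Name constraints assumed from the CloudFormation parameter later in this article:
// alphanumeric runs with single space/underscore/hyphen separators, at most 64 characters.
var validName = new Regex("^([0-9a-zA-Z][ _-]?)+$");

// Hypothetical convention: <application>-<environment>, for example support-bot-prod.
string BuildProfileName(string application, string environment)
{
    var name = $"{application}-{environment}".ToLowerInvariant();
    if (name.Length > 64 || !validName.IsMatch(name))
        throw new ArgumentException($"'{name}' is not a valid inference profile name.");
    return name;
}

string[] applications = { "support-bot", "content-summarizer" };
string[] environments = { "dev", "staging", "prod" };

// One profile name per workload segment (application x environment).
foreach (var app in applications)
    foreach (var env in environments)
        Console.WriteLine(BuildProfileName(app, env));
```

Generating names from one function means a typo in a single environment cannot quietly create a profile that falls outside your reports.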

Figure 1 : Architecture showing .NET services invoking Bedrock through dedicated inference profiles with cost tracking

2. When and Why Are We Doing This? (Application-Level Cost Tracking)

When to adopt this pattern

When multiple AI features share the same Bedrock account and region.
When teams need showback or chargeback between business units.
When you run separate environments (dev, staging, prod) and want clean spend boundaries.
When your product managers need usage-to-value comparisons per feature, not just total AI spend.
When finance requires auditable allocation using tags and cost reports.

Why this is application-level, not only account-level

Account-level reporting is useful for cloud governance, but product teams optimize at the application level. If one feature uses verbose prompts, long outputs, and high-cost models, you need to see that feature directly. Otherwise, optimization efforts are broad and ineffective.

Application-level tracking gives you practical decisions:

Should this feature move from a larger model to a smaller model?
Should we cap output tokens in this workflow?
Which prompt templates are expensive relative to user value?
Which environment leaked unexpected traffic?

💡 Simple governance rule

If a feature can be independently prioritized by product and budget, it should have its own inference profile and tagging policy.

3. How Are We Doing It?

We will implement this in three layers: profile creation, profile tagging, and profile-based invocation from .NET. You can create profiles manually for quick setup or through CloudFormation for repeatable infrastructure.

Approach A: Manual Inference Profile Creation (Console + CLI)

This is the fastest way to validate your design. Use it first when you are proving naming conventions and tag strategy with stakeholders.

Open the AWS Console and navigate to Bedrock inference profile management.
Create a profile with a clear workload-oriented name.
Select the model routing target as required by your workload.
Attach standard tags during creation.
Copy the generated profile identifier for app configuration.

Figure 2 : AWS Bedrock console showing the inference profile creation form

Figure 3 : Bedrock console showing inference profile details with attached cost allocation tags

You can also perform the same steps with the AWS CLI for scripted experimentation:


# Create the profile and capture the ARN that Bedrock generates for it.
PROFILE_ARN=$(aws bedrock create-inference-profile \
  --inference-profile-name support-bot-prod \
  --description "Inference profile for support assistant in production" \
  --model-source copyFrom=arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0 \
  --query inferenceProfileArn \
  --output text)

# Tag using the returned ARN. The resource ID in the ARN is generated by Bedrock,
# so it cannot be derived from the profile name.
aws bedrock tag-resource \
  --resource-arn "$PROFILE_ARN" \
  --tags key=environment,value=prod key=application,value=support-bot key=cost-center,value=cc-410

Code Sample #1 : Example CLI flow to create and tag an inference profile (illustrative)

ℹ️ CLI note

CLI parameters evolve over time. Validate the latest command shape in official AWS Bedrock documentation before running in production automation.

Approach B: Creation via CloudFormation (recommended for production)

Manual creation is useful for discovery. Production systems should move to Infrastructure as Code so profile names, tags, and routing definitions stay consistent across environments and accounts.

A typical CloudFormation template captures:

Profile logical name and human-readable name.
Model source or routing reference.
Environment-aware tags.
Outputs for profile identifiers consumed by application configuration pipelines.


# yaml-language-server: $schema=https://s3.amazonaws.com/cfn-resource-specifications-us-east-1-prod/schemas/2.15.0/all-spec.json
AWSTemplateFormatVersion: "2010-09-09"
Description: >
    Creates an AWS Bedrock Application Inference Profile by copying from a
    foundation model or system-defined cross-region inference profile.
    Deploy once per environment; the output ARN can be dropped straight into
    AwsBedrockConverse as the modelId argument.

Parameters:
    InferenceProfileName:
        Type: String
        Description: >
            Display name for the application inference profile.
            Allowed characters: alphanumeric, space, hyphen, underscore.
            Max 64 characters.
        Default: AwsBedrockConverse-ChatBot-Inference-Profile-CF
        AllowedPattern: "^([0-9a-zA-Z][ _-]?)+$"
        MinLength: 1
        MaxLength: 64

    InferenceProfileDescription:
        Type: String
        Description: Human-readable description of what this profile is used for.
        Default: Application inference profile for the AwsBedrockConverse chat-loop demo. Created via CloudFormation template.
        AllowedPattern: "^([0-9a-zA-Z:.][ _-]?)+$"
        MinLength: 1
        MaxLength: 200

    CopyFromArn:
        Type: String
        Description: >
            ARN of the foundation model or system-defined cross-region inference
            profile to copy into this application profile.
        Default: "arn:aws:bedrock:us-west-2:977098985944:inference-profile/us.meta.llama3-1-8b-instruct-v1:0"

    Environment:
        Type: String
        Description: Deployment environment label (used in resource tags).
        Default: dev
        AllowedValues: [dev, staging, prod]

    BillingServiceName:
        Type: String
        Description: Billing cost-allocation tag value.
        Default: Default-App-Billing-Service

    TeamName:
        Type: String
        Description: Team or squad that owns this resource.
        Default: PoC-Team

Resources:
    BedrockApplicationInferenceProfile:
        Type: AWS::Bedrock::ApplicationInferenceProfile
        Properties:
            InferenceProfileName: !Ref InferenceProfileName
            Description: !Ref InferenceProfileDescription
            ModelSource:
                CopyFrom: !Ref CopyFromArn
            Tags:
                - Key: environment
                  Value: !Ref Environment
                - Key: rx:billing:service-name
                  Value: !Ref BillingServiceName
                - Key: team
                  Value: !Ref TeamName
                - Key: feature
                  Value: chat-loop
                - Key: managed-by
                  Value: cloudformation
                - Key: project
                  Value: aws-experiments
                - Key: AppName
                  Value: SampleBedrockAppUsingAIPUsingCloudFormation

Outputs:
    InferenceProfileId:
        Description: Unique identifier of the created application inference profile.
        Value: !GetAtt BedrockApplicationInferenceProfile.InferenceProfileId
        Export:
            Name: !Sub "${AWS::StackName}-InferenceProfileId"

    InferenceProfileArn:
        Description: >
            Full ARN of the inference profile.
        Value: !GetAtt BedrockApplicationInferenceProfile.InferenceProfileArn
        Export:
            Name: !Sub "${AWS::StackName}-InferenceProfileArn"

    InferenceProfileStatus:
        Description: Status of the inference profile (ACTIVE once ready).
        Value: !GetAtt BedrockApplicationInferenceProfile.Status

    CreatedAt:
        Description: Timestamp when the inference profile was created.
        Value: !GetAtt BedrockApplicationInferenceProfile.CreatedAt

Code Sample #2 : CloudFormation template for application inference profile with cost allocation tags (guardrails excluded)

Figure 4 : CloudFormation stack outputs displaying the inference profile ARN and ID for application configuration

💡 Why CloudFormation wins long-term

It gives repeatability, code review visibility, and drift control. Most importantly, it prevents accidental tag omissions that break cost allocation reports.

Approach C: Tagging Strategy for Cost Attribution

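Tags are the bridge between technical usage and financial reporting. A profile without tags is like a transaction without metadata: it happened, but you cannot classify it reliably. Keep in mind that tag keys only appear in Cost Explorer and cost reports after they have been activated as cost allocation tags in the Billing console, so make activation part of your rollout.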

Start with a small, mandatory taxonomy:

application: product or feature name (for example, support-bot).
environment: dev, staging, prod.
owner-team: team accountable for optimization decisions.
cost-center: finance mapping key.
workload-type: chatbot, summarization, classification, extraction.

Keep values stable and lowercase. Avoid free-text variants like TeamA, team-a, and Team-A because they fragment reports.


{
  "requiredTags": [
    "application",
    "environment",
    "owner-team",
    "cost-center",
    "workload-type"
  ],
  "allowedEnvironments": ["dev", "staging", "prod"],
  "namingRule": "lowercase-with-hyphens"
}

Code Sample #3 : Suggested tag policy contract for inference profiles
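
A contract like this is only useful if something checks it. Below is a minimal sketch of such a check in C#, for example as part of a CI validation step. The function name ValidateTags and the sample tag values are hypothetical, and the regular expression is just one possible reading of the "lowercase-with-hyphens" rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Mirrors the policy contract above; ValidateTags is a hypothetical helper name.
string[] requiredTags = { "application", "environment", "owner-team", "cost-center", "workload-type" };
string[] allowedEnvironments = { "dev", "staging", "prod" };

// One reading of "lowercase-with-hyphens": lowercase alphanumeric words joined by single hyphens.
var lowercaseWithHyphens = new Regex("^[a-z0-9]+(-[a-z0-9]+)*$");

// Returns a list of human-readable violations; an empty list means the tag set passes.
List<string> ValidateTags(IReadOnlyDictionary<string, string> tags)
{
    var errors = new List<string>();
    foreach (var key in requiredTags)
    {
        if (!tags.TryGetValue(key, out var value) || string.IsNullOrWhiteSpace(value))
            errors.Add($"missing required tag '{key}'");
        else if (!lowercaseWithHyphens.IsMatch(value))
            errors.Add($"tag '{key}' has value '{value}', expected lowercase-with-hyphens");
    }
    if (tags.TryGetValue("environment", out var env) && !allowedEnvironments.Contains(env))
        errors.Add($"environment '{env}' is not one of: {string.Join(", ", allowedEnvironments)}");
    return errors;
}

// Usage: a tag set with a mixed-case team, an unknown environment, and no workload-type.
var candidate = new Dictionary<string, string>
{
    ["application"] = "support-bot",
    ["environment"] = "production",
    ["owner-team"] = "Team-A",
    ["cost-center"] = "cc-410",
};
foreach (var error in ValidateTags(candidate))
    Console.WriteLine(error);
```

Returning every violation at once, rather than failing on the first, tends to make pipeline feedback easier to act on.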

Figure 5 : AWS Cost Explorer showing Bedrock costs filtered and grouped by inference profile tags (application, environment, team)

Using the Inference Profile in .NET

Creation and tagging are only half the story. To make attribution real, your application must invoke Bedrock through the profile identifier, not by directly coupling to a model identifier in each feature.


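using System;
using System.Collections.Generic;
using System.Linq;
using Amazon.BedrockRuntime;
using Amazon.BedrockRuntime.Model;

// Assumes `configuration` is an IConfiguration instance supplied by the host
// (for example via dependency injection); it is not constructed in this snippet.
var client = new AmazonBedrockRuntimeClient(Amazon.RegionEndpoint.USEast1);

// Per-environment setting; the InferenceProfileArn stack output can be stored here.
string inferenceProfileId = configuration["Bedrock:InferenceProfileId"]
    ?? throw new InvalidOperationException("Bedrock:InferenceProfileId is not configured.");

var request = new ConverseRequest
{
    // The inference profile identifier goes where a model ID would normally go.
    ModelId = inferenceProfileId,
    Messages = new List<Message>
    {
        new()
        {
            Role = ConversationRole.User,
            Content = new List<ContentBlock>
            {
                new() { Text = "Summarize this support ticket in 3 bullet points." }
            }
        }
    },
    // Capping output tokens keeps per-request cost bounded for this workload.
    InferenceConfig = new InferenceConfiguration
    {
        MaxTokens = 300,
        Temperature = 0.2f
    }
};

var response = await client.ConverseAsync(request);

// Take the first text block of the reply, or an empty string if there is none.
string output = response.Output.Message.Content.FirstOrDefault(c => c.Text != null)?.Text ?? string.Empty;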

Code Sample #4 : Converse request in .NET using inference profile ID

ℹ️ Avoid hidden direct model IDs

If one code path still invokes a raw model ID, that traffic can bypass your cost allocation design. Enforce profile ID usage through shared abstractions and code review checks.
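
One way to enforce this in a shared abstraction is a guard that rejects anything that does not look like an application inference profile ARN before it reaches the request. The sketch below is illustrative only: RequireProfileArn is a hypothetical helper, the sample ARN is made up, and the pattern assumes the ARN shape arn:aws:bedrock:<region>:<account-id>:application-inference-profile/<profile-id>.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed ARN shape for application inference profiles:
// arn:aws:bedrock:<region>:<account-id>:application-inference-profile/<profile-id>
var applicationProfileArn = new Regex(
    @"^arn:aws(-[a-z-]+)?:bedrock:[a-z0-9-]+:[0-9]{12}:application-inference-profile/[A-Za-z0-9.:-]+$");

// Hypothetical guard: returns the value unchanged when it is a profile ARN, otherwise throws.
string RequireProfileArn(string modelIdOrArn)
{
    if (string.IsNullOrWhiteSpace(modelIdOrArn) || !applicationProfileArn.IsMatch(modelIdOrArn))
        throw new ArgumentException(
            $"'{modelIdOrArn}' is not an application inference profile ARN; " +
            "raw model IDs would bypass tag-based cost attribution.");
    return modelIdOrArn;
}

// Usage: a profile ARN passes through, a raw model ID is rejected.
Console.WriteLine(RequireProfileArn(
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123example"));
try
{
    RequireProfileArn("anthropic.claude-3-haiku-20240307-v1:0");
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message);
}
```

Calling the guard where ModelId is assigned turns a silent attribution gap into a failure that shows up at startup or in tests.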

Figure 6 : Inference profile metrics dashboard showing invocation count, latency, token usage, and cost attribution data

Operational Checklist

Create separate profiles for each environment at minimum.
Require tags at creation time and block untagged resources in CI validation.
Store profile IDs in configuration per environment, not hardcoded in source.
Track token usage metrics and correlate with tag-based cost reports weekly.
Review top-cost profiles monthly and optimize prompt length, output caps, or model class.
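
For the weekly token review, a simple per-profile rollup is often enough to start with. The sketch below aggregates an in-memory usage log. The log shape, the SummarizeByProfile name, the "summarizer-prod" profile, and all token counts are made-up illustrations; in a real service the counts would typically come from the usage data returned with each Converse response.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical usage log: one entry per Converse call, keyed by the profile name used.
// All numbers are illustrative sample data, not measurements.
var usageLog = new List<(string Profile, int InputTokens, int OutputTokens)>
{
    ("support-bot-prod", 1200, 300),
    ("support-bot-prod", 800, 250),
    ("summarizer-prod", 4000, 500),
    ("support-bot-staging", 150, 40),
};

// Sum tokens per profile and list the largest consumers first.
List<(string Profile, int InputTokens, int OutputTokens)> SummarizeByProfile(
    IEnumerable<(string Profile, int InputTokens, int OutputTokens)> log) =>
    log.GroupBy(e => e.Profile)
       .Select(g => (Profile: g.Key,
                     InputTokens: g.Sum(e => e.InputTokens),
                     OutputTokens: g.Sum(e => e.OutputTokens)))
       .OrderByDescending(t => t.InputTokens + t.OutputTokens)
       .ToList();

foreach (var row in SummarizeByProfile(usageLog))
    Console.WriteLine($"{row.Profile}: in={row.InputTokens}, out={row.OutputTokens}");
```

Putting this rollup next to the tag-based cost report makes it easier to tell whether a spike came from more traffic or from longer prompts and outputs.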

Summary

Application-level Bedrock cost tracking becomes much clearer when you standardize on inference profiles and tags. You are not just creating a billing view; you are building a feedback loop for product decisions.

What we did: Designed cost attribution through inference profiles for foundation model traffic.
When and why: Use this pattern whenever multiple AI workloads need clean ownership and optimization accountability.
How we did it: Covered manual profile creation, CloudFormation-based provisioning, and mandatory tagging strategy, plus .NET invocation through profile IDs.

With this setup, the next time finance asks "where did the spend come from?", your answer is not a guess. It is a report.

References & Further Reading

ℹ️ Next in the series

A good follow-up is building a small cost dashboard that combines Bedrock token usage telemetry and tag-based billing views for weekly optimization reviews.

For reference, the code repository discussed in this article is available on GitHub: https://github.com/ajaysskumar/ai-playground

Thanks for reading. Please share any feedback in the comments or by email at ajay.a338@gmail.com