Content Metering

Gateway's metering is a scalable, flexible, integrated usage metering solution that uses Elasticsearch for data storage, management, and analysis. Configure usage metering to send batched storage and network statistics to the Elasticsearch server at the interval you need. Access the metering data by querying Elasticsearch directly or through the Content Management API (https://perifery.atlassian.net/wiki/spaces/public/pages/2443822389). Metering gathers the data needed to manage the business of the organization:

  • Current usage numbers allow evaluating usage quotas.

  • Usage over time allows billing generation.

  • Historical usage queries populate graphs for easy monitoring from the dashboard.

Note

Metering replaces the legacy CSMeter package (csmeter and cshistory utilities), which is deprecated and no longer included with the Gateway.

Configuring Metering

Metering requires minimal configuration to implement:

  • metering.enabled (disabled by default)

  • storage_cluster.indexerHosts (must be defined for metering)

Gateway includes other configuration parameters that are specific to controlling metering for special cases.

See https://perifery.atlassian.net/wiki/spaces/public/pages/2443810201.
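The two required settings can be checked with a short script. This is only a sketch: it assumes the dotted names above map to an INI-style file in which metering.enabled is the enabled key of a [metering] section and storage_cluster.indexerHosts is the indexerHosts key of a [storage_cluster] section. The sample host names and the metering_problems helper are hypothetical.

```python
import configparser

# Hypothetical configuration fragment; the section/key layout is an assumption
# derived from the dotted parameter names, and the host names are made up.
SAMPLE_CFG = """
[storage_cluster]
indexerHosts = es1.example.com,es2.example.com

[metering]
enabled = true
"""


def metering_problems(cfg_text):
    """Return a list of problems; an empty list means both settings are present."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case (indexerHosts) exactly as written
    parser.read_string(cfg_text)

    problems = []
    # metering.enabled is disabled by default, so a missing key counts as off.
    if not parser.getboolean("metering", "enabled", fallback=False):
        problems.append("metering.enabled is not true (disabled by default)")
    hosts = parser.get("storage_cluster", "indexerHosts", fallback="").strip()
    if not hosts:
        problems.append("storage_cluster.indexerHosts is not defined")
    return problems


print(metering_problems(SAMPLE_CFG))
```

An empty list indicates that both settings named above are present; any other parameters are outside the scope of this check.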

Metering Statistics

Gateway emits two types of usage statistics: storage and network.

Storage Statistics

  • bytesSize: Sums the bytes of content (logical objects, including versions) uploaded for storage in the context. This value is the sum of the individual Content-Length headers of the objects.

  • bytesStored: Sums the bytes of space used on disk by all Swarm objects in the context, including all replicas and erasure-coded segments. This is also commonly called raw storage. The value is based on the expected number of replicas and does not account for temporary under- or over-replication that may exist in the cluster.

  • objectsStored: Counts the number of unique objects being stored in the context.

Notes on storage statistics:

  • Swarm storage usage statistics are reported by bucket, domain, and tenant. Untenanted domains are grouped into a synthesized "_system" tenant.

  • The context is a domain or a domain and bucket.

  • The absence of a bucket returns the storage used by unnamed objects.

Network Statistics

  • bytesIn: Sums the bandwidth usage from clients to the Gateway.

  • bytesOut: Sums the bandwidth usage from the Gateway to clients.

  • opCount: Counts the number of client operations.

Notes on network statistics:

  • Network usage is reported at the context to which the requests are made, that is, the particular bucket or domain.

  • Network usage only includes storage operations and excludes Management API requests.
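The difference between bytesSize (logical) and bytesStored (raw) can be illustrated with simple arithmetic. The sketch below is a simplification under stated assumptions: it covers replicated objects only, ignores erasure-coded segments and any per-object overhead on disk, and the logical_and_raw helper and sample sizes are hypothetical.

```python
def logical_and_raw(objects):
    """objects: iterable of (content_length_bytes, expected_replicas) pairs.

    Returns (bytes_size, bytes_stored) for replicated objects only.
    """
    objects = list(objects)
    # Logical size: the sum of the objects' Content-Length values.
    bytes_size = sum(size for size, _ in objects)
    # Raw size: each object counted once per expected replica.
    bytes_stored = sum(size * replicas for size, replicas in objects)
    return bytes_size, bytes_stored


# Two hypothetical objects: 1000 bytes with 2 replicas, 500 bytes with 3.
print(logical_and_raw([(1_000, 2), (500, 3)]))  # (1500, 3500)
```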

Metering API

The API for metering is part of the Gateway's Content Management API (https://perifery.atlassian.net/wiki/spaces/public/pages/2443822389).

Request

Each tenant, domain, and bucket has a subresource prefix, meter:

  • Tenant t1: /_admin/manage/tenants/t1/meter/

  • Domain d1: /_admin/manage/tenants/t1/domains/d1/meter/

  • Bucket b1: /_admin/manage/tenants/t1/domains/d1/buckets/b1/meter/
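The three prefixes above follow one pattern, which a small helper can assemble. This is an illustrative sketch, not part of the Gateway: the meter_prefix function is hypothetical, and percent-encoding of the names is an assumption.

```python
from typing import Optional
from urllib.parse import quote


def meter_prefix(tenant: str, domain: Optional[str] = None,
                 bucket: Optional[str] = None) -> str:
    """Build the meter subresource prefix for a tenant, domain, or bucket."""
    if bucket is not None and domain is None:
        raise ValueError("a bucket context requires its domain")
    # Percent-encoding the names is an assumption, not documented behavior.
    path = "/_admin/manage/tenants/" + quote(tenant, safe="")
    if domain is not None:
        path += "/domains/" + quote(domain, safe="")
    if bucket is not None:
        path += "/buckets/" + quote(bucket, safe="")
    return path + "/meter/"


print(meter_prefix("t1"))              # /_admin/manage/tenants/t1/meter/
print(meter_prefix("t1", "d1"))        # /_admin/manage/tenants/t1/domains/d1/meter/
print(meter_prefix("t1", "d1", "b1"))  # /_admin/manage/tenants/t1/domains/d1/buckets/b1/meter/
```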

Under meter, this is the endpoint for the specific context (tenant, domain, bucket):

Metering Endpoint for Context
meter/usage/{metric}?from={startDate}&to={endDate}&groupBy={groupBy}

{metric} (required, case-sensitive)

Specifies which metric to analyze:

  • bytesIn (from client to Swarm)

  • bytesOut (from Swarm to client)

  • bytesSize (sum of logical objects' Content-Length values)

  • bytesStored (sum of physical disk storage consumed)

  • objectsStored (number of logical objects)

  • opCount (operation count, minus Management API requests)

Point-in-time (current) queries for untenanted objects support these metric forms (v6.2):

  • bytesSize/untenanted (sum of logical objects' Content-Length values)

  • bytesStored/untenanted (sum of physical disk storage consumed)

  • objectsStored/untenanted (number of logical objects)

{startDate} (required)

Formats:

YYYY-MM-DDT00:00Z
YYYY-MM-DDThh:mmZ
YYYY-MM-DDThh:mm:ssZ

{endDate} (required)

Formats:

YYYY-MM-DDT00:00Z
YYYY-MM-DDThh:mmZ
YYYY-MM-DDThh:mm:ssZ

Must be later than {startDate}.

{groupBy} (optional, case-sensitive)

Specifies which time increment to group by:

  • hour

  • day

  • unspecified: no grouping; the date range is collapsed to a single value

Note

Grouping aggregates storage metrics by average and network metrics by sum.
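The aggregation rule in the note can be sketched as follows. This only illustrates the rule and is not how Gateway computes it internally; the group_by_day helper, the sample timestamps, and the assumption that each sample carries a YYYY-MM-DDThh:mmZ timestamp are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

STORAGE = {"bytesSize", "bytesStored", "objectsStored"}
NETWORK = {"bytesIn", "bytesOut", "opCount"}


def group_by_day(metric, samples):
    """samples: list of (timestamp 'YYYY-MM-DDThh:mmZ', value) pairs."""
    if metric not in STORAGE | NETWORK:
        raise ValueError(f"unknown metric: {metric}")
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp[:10]].append(value)  # first 10 chars: YYYY-MM-DD
    # Storage metrics are averaged; network metrics are summed.
    combine = mean if metric in STORAGE else sum
    return {day: combine(values) for day, values in sorted(buckets.items())}


samples = [("2016-06-01T00:00Z", 100), ("2016-06-01T12:00Z", 300),
           ("2016-06-02T00:00Z", 50)]
print(group_by_day("bytesStored", samples))  # 2016-06-01 averages to 200
print(group_by_day("bytesIn", samples))      # 2016-06-01 sums to 400
```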

Add /children to fetch a metric for the children of a context (either the tenant's domains or the domain's buckets):

Metering Endpoint for Child of Context
meter/usage/{metric}/children?from={startDate}&to={endDate}&groupBy={groupBy}
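As a rough illustration, the following sketch assembles the usage endpoint, with or without /children, and enforces the documented constraints (known metric, known grouping, end date later than start date). The usage_path function is hypothetical, and requiring timezone-aware datetime values is a design choice of this sketch rather than anything Gateway mandates.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

METRICS = {"bytesIn", "bytesOut", "bytesSize", "bytesStored",
           "objectsStored", "opCount"}
GROUPINGS = {"hour", "day"}


def usage_path(metric, start, end, group_by=None, children=False):
    """Build the path and query string that follow a context's meter prefix."""
    if metric not in METRICS:
        raise ValueError(f"unknown metric (names are case-sensitive): {metric}")
    if group_by is not None and group_by not in GROUPINGS:
        raise ValueError(f"unknown grouping: {group_by}")
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    if end <= start:
        raise ValueError("endDate must be later than startDate")

    fmt = "%Y-%m-%dT%H:%MZ"  # the YYYY-MM-DDThh:mmZ form, expressed in UTC
    params = {"from": start.astimezone(timezone.utc).strftime(fmt),
              "to": end.astimezone(timezone.utc).strftime(fmt)}
    if group_by is not None:
        params["groupBy"] = group_by
    path = f"meter/usage/{metric}" + ("/children" if children else "")
    # Keep ':' unescaped so the dates read as they do in this document.
    return path + "?" + urlencode(params, safe=":")


june = datetime(2016, 6, 1, tzinfo=timezone.utc)
july = datetime(2016, 7, 1, tzinfo=timezone.utc)
print(usage_path("bytesIn", june, july, group_by="day", children=True))
```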

Run a point-in-time query specific to storage metrics:

Point-in-Time Storage Metrics
meter/usage/bytesSize/current
meter/usage/bytesStored/current
meter/usage/objectsStored/current
meter/usage/bytesSize/current/children
meter/usage/bytesStored/current/children
meter/usage/objectsStored/current/children
meter/usage/bytesSize/untenanted/current
meter/usage/bytesStored/untenanted/current
meter/usage/objectsStored/untenanted/current

Note

For all current metrics, no date range is required and grouping is not applicable.
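The point-in-time paths listed above can be generated from a metric name and two flags. This is a sketch only: the current_path helper is hypothetical, and it rejects combining untenanted with children simply because that combination does not appear in the list above.

```python
STORAGE_METRICS = ("bytesSize", "bytesStored", "objectsStored")


def current_path(metric, children=False, untenanted=False):
    """Build a point-in-time path; no date range or grouping applies."""
    if metric not in STORAGE_METRICS:
        raise ValueError("point-in-time queries apply to storage metrics only")
    if children and untenanted:
        # Not listed in the documentation, so this sketch does not guess at it.
        raise ValueError("untenanted with children is not a listed form")
    path = f"meter/usage/{metric}"
    if untenanted:
        path += "/untenanted"
    path += "/current"
    if children:
        path += "/children"
    return path


for name in STORAGE_METRICS:
    print(current_path(name))
    print(current_path(name, children=True))
    print(current_path(name, untenanted=True))
```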

Response

The response to a query is an array of objects ("rows"), with fields that correspond to the data for each entry. These are the possible fields:

tenant
domain
bucket

The name of the applicable tenant, domain, or bucket for the object. Untenanted domains are grouped within the "_system" tenant name. Unnamed objects in a domain are grouped within an empty string ("") bucket name. The domain can also be an empty string, recording requests at the tenant level outside of any domain. If the domain or bucket had activity during the requested timeframe, but the name is not available because it has been deleted, the UUID is returned instead. The UUID corresponds to the former domain or bucket's Castor-System-Alias value.

bytesIn
bytesOut
bytesSize
bytesStored
objectsStored
opCount

The value for the metric requested, which corresponds to the {metric} from the request.

timestamp

For queries grouped by time, the timestamp for a given time grouping. For example, if grouping by day across a week of time, the timestamp identifies which of those 7 days relates to each result.

An empty list ("[]") is returned if no records exist for the query range. When fetching the children of a context, if a child has no data for the query range, the record is excluded from the response.
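A client might process such a response along these lines. The sketch assumes the rows arrive as a JSON array with the fields described above. The sample body, its names and values, the timestamp format, and the series_by_bucket helper are all hypothetical and do not come from a real Gateway.

```python
import json

# Hypothetical response body for a domain's children, grouped by day.
SAMPLE_BODY = """[
  {"tenant": "tenant1", "domain": "domain1", "bucket": "photos",
   "bytesIn": 1200, "timestamp": "2016-06-01T00:00Z"},
  {"tenant": "tenant1", "domain": "domain1", "bucket": "photos",
   "bytesIn": 800, "timestamp": "2016-06-02T00:00Z"},
  {"tenant": "tenant1", "domain": "domain1", "bucket": "",
   "bytesIn": 50, "timestamp": "2016-06-01T00:00Z"}
]"""


def series_by_bucket(body, metric):
    """Map each bucket to its list of (timestamp, value) pairs."""
    series = {}
    for row in json.loads(body):  # an empty list yields an empty dict
        bucket = row.get("bucket", "")
        # An empty bucket name groups the domain's unnamed objects.
        label = bucket if bucket else "(unnamed objects)"
        series.setdefault(label, []).append((row.get("timestamp"), row[metric]))
    return series


print(series_by_bucket(SAMPLE_BODY, "bytesIn"))
print(series_by_bucket("[]", "bytesIn"))  # {}
```

Because a deleted domain or bucket is reported by UUID rather than by name, a client that displays these labels should expect either form.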

Example Metering Requests

Example Request/Response

The following is the result of a query for /children that uses day grouping. The target of this query is the domain domain1, which belongs to tenant tenant1.

Example Metering Request/Response

Common Billing Queries

These queries are common when integrating with billing systems that calculate charges for bandwidth in/out and storage at the end of a calendar month. In these examples, the period being queried is midnight 2016-06-01 UTC through midnight 2016-07-01 UTC. Note that the storage numbers are the average storage over the month, while the bandwidth is the total at the end of the month.

  • System Tenant Raw Storage by Domain

  • Tenant 'bravo' Raw Storage

  • Domain 'xray.example.com' Raw Storage

  • Bandwidth IN for Domains of Tenant 'tango'

  • Bandwidth OUT for Domains of Tenant 'tango'

  • Raw Storage for Domain 'uniform.example.com' by Bucket

  • Logical Storage for Domain 'uniform.example.com' by Bucket
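The request paths for these queries can be derived from the endpoint patterns documented under Request. The sketch below is a reconstruction under assumptions, not the original examples: it assumes the synthesized "_system" tenant can be addressed in the path like any other tenant, and because the tenants that own xray.example.com and uniform.example.com are not stated, it leaves a literal {tenant} placeholder for them. Omitting groupBy collapses the month to a single value. The usage and billing_queries helpers are hypothetical.

```python
# The billing period used in the examples: June 2016, in UTC.
PERIOD = "from=2016-06-01T00:00Z&to=2016-07-01T00:00Z"
ROOT = "/_admin/manage/tenants"


def usage(context, metric, children=False):
    """context is the part of the path between the root and /meter/."""
    suffix = "/children" if children else ""
    return f"{ROOT}/{context}/meter/usage/{metric}{suffix}?{PERIOD}"


def billing_queries(owner="{tenant}"):
    """owner: tenant of the example domains (unknown, so a placeholder)."""
    xray = f"{owner}/domains/xray.example.com"
    uniform = f"{owner}/domains/uniform.example.com"
    return {
        "System Tenant Raw Storage by Domain":
            usage("_system", "bytesStored", children=True),
        "Tenant 'bravo' Raw Storage": usage("bravo", "bytesStored"),
        "Domain 'xray.example.com' Raw Storage": usage(xray, "bytesStored"),
        "Bandwidth IN for Domains of Tenant 'tango'":
            usage("tango", "bytesIn", children=True),
        "Bandwidth OUT for Domains of Tenant 'tango'":
            usage("tango", "bytesOut", children=True),
        "Raw Storage for Domain 'uniform.example.com' by Bucket":
            usage(uniform, "bytesStored", children=True),
        "Logical Storage for Domain 'uniform.example.com' by Bucket":
            usage(uniform, "bytesSize", children=True),
    }


for title, path in billing_queries().items():
    print(f"{title}\n  {path}")
```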

Index Generation for Metering

Metering uses a different index for each day, making it efficient to expire old data. An Elasticsearch index alias combines the daily indices into a queryable whole. To support this, each Gateway runs a daily maintenance task that deletes any indices older than the retention period and adds the new daily index to the alias. This maintenance is scheduled with offsets so that multiple Gateways do not perform the task simultaneously.
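The decision made by the daily maintenance task can be sketched as a pure function. Everything specific here is an assumption for illustration: the "metering-YYYY-MM-DD" index naming, the plan_maintenance helper, and the reading of "older than the retention period" as strictly before the cutoff date. The real task also talks to Elasticsearch, which this sketch leaves out.

```python
from datetime import date, timedelta


def plan_maintenance(existing, today, retention_days, prefix="metering-"):
    """Return (indices_to_delete, index_to_add_to_alias).

    existing: names of daily indices, assumed to look like prefix + YYYY-MM-DD.
    """
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix):
            continue  # not a daily metering index under this naming assumption
        day = date.fromisoformat(name[len(prefix):])
        if day < cutoff:  # older than the retention period
            expired.append(name)
    return sorted(expired), prefix + today.isoformat()


indices = ["metering-2016-06-01", "metering-2016-06-20", "metering-2016-06-30"]
print(plan_maintenance(indices, date(2016, 7, 1), retention_days=10))
```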

Retaining Data for De-Provisioned Resources

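Take care not to lose access to metering data when de-provisioning a tenant or a domain. Once a tenant or domain is removed, there is no longer a Content Management API path through which to run metering queries for it. To keep the historical data accessible, retire the tenant or domain as follows instead of deleting it immediately: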

  1. Change ownership of the tenant or domain to be an admin user.

  2. Update the Policy to remove non-admin access.

  3. Purge all objects being stored in the decommissioned domain (or all domains for a decommissioned tenant).

Delete the decommissioned tenant or domain once the historical usage data no longer needs to be retained. After deletion, the domain or bucket may still appear in metering queries that cover an earlier time, but its name is no longer available, so the UUID is returned instead of the name.

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.