
Gateway's metering is a scalable, flexible, integrated usage metering solution that uses Elasticsearch for data storage, management, and analysis. Configure usage metering to send batched storage and network statistics to the Elasticsearch server at the interval you need. Access the metering data by querying Elasticsearch directly or through the Content Management API. Metering gathers the data needed to manage the business of the organization:

  • Current usage numbers allow evaluating usage quotas.

  • Usage over time allows billing generation.

  • Historical usage queries populate graphs for easy monitoring from the dashboard.

Note

Metering replaces the legacy CSMeter package (csmeter and cshistory utilities), which is deprecated and no longer included with the Gateway.

Configuring Metering

Metering requires minimal configuration to implement:

  • metering.enabled (disabled by default)

  • storage_cluster.indexerHosts (must be defined for metering)

Gateway includes other configuration parameters that are specific to controlling metering for special cases.

See Gateway Configuration.

Metering Statistics

Gateway emits two types of usage statistics: storage and network.

Type

Statistic

Description

Notes

Storage

bytesSize

Sums the bytes of content (logical objects, including versions) uploaded for storage in the context. Summing the individual Content-Length headers of the objects provides this value.

Swarm storage usage statistics are reported by bucket, domain, and tenant. Untenanted domains are grouped into a synthesized "_system" tenant.

The context is a domain or a domain and bucket.

The absence of a bucket returns the storage used by unnamed objects.

bytesStored

Sums the bytes of space used on disk by all Swarm objects in the context, including all replicas and erasure-coded segments. This is also commonly called raw storage.

This statistic is calculated from the expected number of replicas in the cluster; it does not account for temporary under- or over-replication that may exist in the cluster.

objectsStored

Counts the number of unique objects being stored in the context.

Network

bytesIn

Sums the bandwidth used from clients to the Gateway.

Network usage is reported at the context to which the requests are made: the particular bucket or domain.

Network usage only includes storage operations and excludes Management API requests.

bytesOut

Sums the bandwidth used from the Gateway to clients.

opCount

Counts the number of client operations.

Metering API

The API for metering is part of the Gateway's Content Management API.

Request

Each tenant, domain, and bucket has a subresource prefix, meter:

  • Tenant t1: /_admin/manage/tenants/t1/meter/

  • Domain d1: /_admin/manage/tenants/t1/domains/d1/meter/

  • Bucket b1: /_admin/manage/tenants/t1/domains/d1/buckets/b1/meter/

Under meter, this is the endpoint for the specific context (tenant, domain, bucket):

Metering endpoint for context
meter/usage/{metric}
	?from={startDate}
	&to={endDate}
	&groupBy={groupBy}

Value

Required

Case


{metric}

Yes

Case-sensitive

Specifies which metric to analyze:

  • bytesIn (from client to Swarm) 

  • bytesOut (from Swarm to client)

  • bytesSize (sum of logical objects' Content-Length values)

  • bytesStored (sum of physical disk storage consumed)

  • objectsStored (number of logical objects)

  • opCount (operation count, minus Management API requests)

These requests are supported for point-in-time (current) queries for untenanted objects: (v6.2)

  • bytesSize/untenanted (sum of logical objects' Content-Length values)

  • bytesStored/untenanted (sum of physical disk storage consumed)

  • objectsStored/untenanted (number of logical objects)

{startDate}

Yes


YYYY-MM-DDT00:00Z
YYYY-MM-DDThh:mmZ
YYYY-MM-DDThh:mm:ssZ

{endDate}

Yes


YYYY-MM-DDT00:00Z
YYYY-MM-DDThh:mmZ
YYYY-MM-DDThh:mm:ssZ

Must be later than {startDate}.

{groupBy}

No

Case-sensitive

Specifies which time increment to group by:

  • hour

  • day 

  • unspecified - no grouping: date range is collapsed to a single value

Note

Grouping aggregates storage metrics by average and network metrics by sum.

Add /children to fetch a metric for the children of a context (either the tenant's domains or the domain's buckets):

Metering endpoint for child of context
meter/usage/{metric}/children
	?from={startDate}
	&to={endDate}
	&groupBy={groupBy}
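
The endpoint templates above can be assembled programmatically. The following is a minimal sketch in Python, not part of the Gateway itself: the function names `context_prefix` and `usage_query` are hypothetical, and the sketch assumes timezone-aware `datetime` values that it renders in the `YYYY-MM-DDThh:mm:ssZ` form. It only builds the request path and query string; sending the request and authenticating are left out.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

# Metric names are case-sensitive and taken from the parameter table.
METRICS = {"bytesIn", "bytesOut", "bytesSize", "bytesStored",
           "objectsStored", "opCount"}


def context_prefix(tenant, domain=None, bucket=None):
    """Return the meter/ prefix for a tenant, domain, or bucket context."""
    if bucket is not None and domain is None:
        raise ValueError("a bucket context requires a domain")
    path = "/_admin/manage/tenants/" + quote(tenant, safe="")
    if domain is not None:
        path += "/domains/" + quote(domain, safe="")
    if bucket is not None:
        path += "/buckets/" + quote(bucket, safe="")
    return path + "/meter/"


def usage_query(prefix, metric, start, end, group_by=None, children=False):
    """Build a time-range metering request path (hypothetical helper)."""
    if metric not in METRICS:
        raise ValueError("unknown metric: " + metric)
    if group_by not in (None, "hour", "day"):
        raise ValueError("groupBy must be 'hour', 'day', or omitted")
    if end <= start:
        raise ValueError("endDate must be later than startDate")

    def stamp(moment):
        # Render as YYYY-MM-DDThh:mm:ssZ in UTC.
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    params = {"from": stamp(start), "to": stamp(end)}
    if group_by is not None:
        params["groupBy"] = group_by
    path = prefix + "usage/" + metric
    if children:
        path += "/children"
    # Keep ':' unescaped so the timestamps stay readable in the URL.
    return path + "?" + urlencode(params, safe=":")


if __name__ == "__main__":
    print(usage_query(
        context_prefix("tenant1", "domain1"), "bytesIn",
        datetime(2015, 7, 1, tzinfo=timezone.utc),
        datetime(2015, 7, 3, tzinfo=timezone.utc),
        group_by="day", children=True))
```

Validating the metric name and date order on the client side is a design choice of this sketch; the Gateway's own error handling for invalid values is not described here.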

Run a point-in-time query specific to storage metrics:

Point-in-time storage metrics
meter/usage/bytesSize/current
meter/usage/bytesStored/current
meter/usage/objectsStored/current

meter/usage/bytesSize/current/children
meter/usage/bytesStored/current/children
meter/usage/objectsStored/current/children

meter/usage/bytesSize/untenanted/current
meter/usage/bytesStored/untenanted/current
meter/usage/objectsStored/untenanted/current

Note

For all current metrics, no date range is required and grouping is not applicable.
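
The point-in-time paths listed above follow a regular pattern, which can be sketched as a small Python helper. The name `current_usage_path` is hypothetical, and the helper takes an already-built `meter/` prefix as a plain string. Because the list above shows no path that combines `untenanted` with `children`, this sketch rejects that combination rather than guessing at one.

```python
# Only storage metrics support point-in-time (current) queries.
STORAGE_METRICS = ("bytesSize", "bytesStored", "objectsStored")


def current_usage_path(prefix, metric, children=False, untenanted=False):
    """Build a point-in-time storage metric path (hypothetical helper)."""
    if metric not in STORAGE_METRICS:
        raise ValueError("current queries apply to storage metrics only")
    if children and untenanted:
        # Assumption: no such path is documented, so refuse to build it.
        raise ValueError("untenanted and children cannot be combined here")
    path = prefix.rstrip("/") + "/usage/" + metric
    if untenanted:
        path += "/untenanted"
    path += "/current"
    if children:
        path += "/children"
    return path


if __name__ == "__main__":
    print(current_usage_path("/_admin/manage/tenants/t1/meter/", "bytesStored"))
```

No `from`, `to`, or `groupBy` parameters are added, since current metrics take no date range or grouping.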

Response

The response to a query is an array of objects ("rows"), with fields that correspond to the data for each entry. These are the possible fields:

tenant
domain
bucket

The name of the applicable tenant, domain, or bucket for the object. Untenanted domains are grouped within the "_system" tenant name. Unnamed objects in a domain are grouped within an empty string ("") bucket name. The domain can also be an empty string, recording requests at the tenant level outside of any domain. If the domain or bucket had activity during the requested timeframe, but the name is not available because it has been deleted, the UUID is returned instead. The UUID corresponds to the former domain or bucket's Castor-System-Alias value.

bytesIn
bytesOut
bytesSize
bytesStored
objectsStored
opCount

The value for the metric requested, which corresponds to the {metric} from the request.

timestamp

For queries grouped by time, the timestamp for a given time grouping.

For example, when grouping by day across a week, the timestamp identifies which of the 7 days each result relates to.

An empty list ("[]") is returned if no records exist for the query range. When fetching the children of a context, if a child has no data for the query range, the record is excluded from the response.

Note

Using /children and a time grouping together may result in additional rows to express each time/child combination, as RDBMS queries with multiple GROUP BY arguments return separate rows per every combination.

Example Metering Requests

Example request/response

The following is a result from a query for /children that uses a day grouping. The target of this query is the domain domain1, which belongs to tenant tenant1.

Note: the results are for the children of the domain, which are the buckets (as opposed to the children of a tenant, which are the domains):

Example metering request/response
GET /_admin/manage/tenants/tenant1/domains/domain1/meter/usage/bytesIn/children
	?from=2015-07-01T00:00Z&to=2015-07-03T00:00Z&groupBy=day
[
   {
       tenant: "tenant1",
       domain: "domain1",
       timestamp: "2015-07-01T00:00:00.000Z",
       bucket: "research",
       bytesIn: 27277
   }, {
       tenant: "tenant1",
       domain: "domain1",
       timestamp: "2015-07-01T00:00:00.000Z",
       bucket: "archive",
       bytesIn: 18771
   }, {
       tenant: "tenant1",
       domain: "domain1",
       timestamp: "2015-07-02T00:00:00.000Z",
       bucket: "research",
       bytesIn: 27855
   }, {
       tenant: "tenant1",
       domain: "domain1",
       timestamp: "2015-07-02T00:00:00.000Z",
       bucket: "archive",
       bytesIn: 19645
   }
]
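
Because each row expresses one time/child combination, client code usually needs to regroup the rows before charting or totaling them. The following Python sketch shows one way to do that, assuming the response has already been decoded into a list of dictionaries; the function name `pivot_by_child` is hypothetical, and the rows are copied from the example response above.

```python
from collections import defaultdict


def pivot_by_child(rows, metric, child_key):
    """Regroup rows into {child name: {timestamp: value}}."""
    table = defaultdict(dict)
    for row in rows:
        table[row[child_key]][row["timestamp"]] = row[metric]
    return dict(table)


# Rows from the example response, decoded into Python dictionaries.
rows = [
    {"tenant": "tenant1", "domain": "domain1",
     "timestamp": "2015-07-01T00:00:00.000Z", "bucket": "research", "bytesIn": 27277},
    {"tenant": "tenant1", "domain": "domain1",
     "timestamp": "2015-07-01T00:00:00.000Z", "bucket": "archive", "bytesIn": 18771},
    {"tenant": "tenant1", "domain": "domain1",
     "timestamp": "2015-07-02T00:00:00.000Z", "bucket": "research", "bytesIn": 27855},
    {"tenant": "tenant1", "domain": "domain1",
     "timestamp": "2015-07-02T00:00:00.000Z", "bucket": "archive", "bytesIn": 19645},
]

table = pivot_by_child(rows, "bytesIn", "bucket")

# bytesIn is a network metric, so summing the daily values gives the
# total for the whole range.
totals = {name: sum(series.values()) for name, series in table.items()}
print(totals)  # {'research': 55132, 'archive': 38416}
```

Summing across days is appropriate for network metrics only; storage metrics are averaged when grouped, so a sum of daily storage values would not be a meaningful total.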

Common billing queries

These queries are common when integrating with billing systems that calculate charges for bandwidth in/out and storage at the end of a calendar month. In these examples, the period being queried is midnight 2016-06-01 UTC through midnight 2016-07-01 UTC. Note: because no grouping is specified, the storage numbers are the average storage over the month, while the bandwidth numbers are the totals for the month.

System tenant raw storage by domain
GET /_admin/manage/tenants/_system/meter/usage/bytesStored/children
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z

Tenant 'bravo' raw storage
GET /_admin/manage/tenants/bravo/meter/usage/bytesStored
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z

Domain 'xray.example.com' raw storage
GET /_admin/manage/tenants/delta/domains/xray.example.com/meter/usage/bytesStored
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z

Bandwidth IN for domains of tenant 'tango'
GET /_admin/manage/tenants/tango/meter/usage/bytesIn/children
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z

Bandwidth OUT for domains of tenant 'tango'
GET /_admin/manage/tenants/tango/meter/usage/bytesOut/children
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z

Raw storage for domain 'uniform.example.com' by bucket
GET /_admin/manage/tenants/tango/domains/uniform.example.com/meter/usage/bytesStored/children
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z

Logical storage for domain 'uniform.example.com' by bucket
GET /_admin/manage/tenants/tango/domains/uniform.example.com/meter/usage/bytesSize/children
  ?from=2016-06-01T00:00:00Z&to=2016-07-01T00:00:00Z
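
A billing integration typically computes the from and to values for each calendar month rather than hard-coding them. A minimal Python sketch follows; the function name `billing_period` is hypothetical, and the sketch assumes the period runs from midnight UTC on the first of the month to midnight UTC on the first of the following month, as in the examples above.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def billing_period(year, month):
    """Return (from, to) timestamps covering one calendar month in UTC."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # Roll over to January of the next year after December.
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    return start.strftime(STAMP_FORMAT), end.strftime(STAMP_FORMAT)


if __name__ == "__main__":
    start, end = billing_period(2016, 6)
    print("?from=" + start + "&to=" + end)
```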

Index Generation for Metering

Metering uses a different index for each day, making it efficient to expire old data. An Elasticsearch index alias combines the daily indices into a queryable whole. To support this, each Gateway runs a daily maintenance task that deletes any indices older than the retention period and adds the new daily index to the alias. This maintenance is scheduled with offsets so that multiple Gateways do not perform the task simultaneously.

Note

Statistics are batched for a period of time. Samples are assigned the date on which they are written, not the date on which they are collected. A new index for the new day is created automatically.
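
The expiry half of the daily maintenance can be illustrated with a short Python sketch. This is not the Gateway's implementation: the index prefix `metering-`, the date-suffixed naming, and the function name `expired_indices` are all hypothetical, and "older than the retention period" is assumed to mean an index date strictly earlier than today minus the retention in days.

```python
from datetime import date, timedelta

# Hypothetical naming scheme: one index per day, e.g. "metering-2016-06-01".
INDEX_PREFIX = "metering-"


def expired_indices(index_names, today, retention_days):
    """Return the daily indices whose date falls before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(INDEX_PREFIX):
            continue  # not a metering index under the assumed scheme
        try:
            day = date.fromisoformat(name[len(INDEX_PREFIX):])
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    names = ["metering-2016-06-01", "metering-2016-06-20", "metering-2016-06-30"]
    print(expired_indices(names, date(2016, 6, 30), retention_days=10))
```

Because each day has its own index, expiring old data amounts to dropping whole indices instead of deleting individual records.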

Retaining Data for De-provisioned Resources

Take care not to lose access to metering data when de-provisioning a tenant or a domain. Once a tenant or domain is removed, there is no longer a Content Management API path through which to run metering queries for it.

Best practice - Retain the historical metering data for each decommissioned entity:

  1. Change ownership of the tenant or domain to be an admin user.

  2. Update the Policy to remove non-admin access.

  3. Purge all objects being stored in the decommissioned domain (or all domains for a decommissioned tenant).

  4. Important: Do not delete the empty tenant or domain; retain it as is.

Delete the decommissioned tenant or domain once the historical usage data no longer needs to be retained. After deletion, the domain or bucket may still be returned in metering queries that cover an earlier time, but its name is no longer available; the UUID is returned instead of the name.
