| <html><body> |
| <style> |
| |
| body, h1, h2, h3, div, span, p, pre, a { |
| margin: 0; |
| padding: 0; |
| border: 0; |
| font-weight: inherit; |
| font-style: inherit; |
| font-size: 100%; |
| font-family: inherit; |
| vertical-align: baseline; |
| } |
| |
| body { |
| font-size: 13px; |
| padding: 1em; |
| } |
| |
| h1 { |
| font-size: 26px; |
| margin-bottom: 1em; |
| } |
| |
| h2 { |
| font-size: 24px; |
| margin-bottom: 1em; |
| } |
| |
| h3 { |
| font-size: 20px; |
| margin-bottom: 1em; |
| margin-top: 1em; |
| } |
| |
| pre, code { |
| line-height: 1.5; |
| font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace; |
| } |
| |
| pre { |
| margin-top: 0.5em; |
| } |
| |
| h1, h2, h3, p { |
| font-family: Arial, sans-serif; |
| } |
| |
| h1, h2, h3 { |
| border-bottom: solid #CCC 1px; |
| } |
| |
| .toc_element { |
| margin-top: 0.5em; |
| } |
| |
| .firstline { |
| margin-left: 2em; |
| } |
| |
| .method { |
| margin-top: 1em; |
| border: solid 1px #CCC; |
| padding: 1em; |
| background: #EEE; |
| } |
| |
| .details { |
| font-weight: bold; |
| font-size: 14px; |
| } |
| |
| </style> |
| |
| <h1><a href="dlp_v2.html">Cloud Data Loss Prevention (DLP) API</a> . <a href="dlp_v2.projects.html">projects</a> . <a href="dlp_v2.projects.dlpJobs.html">dlpJobs</a></h1> |
| <h2>Instance Methods</h2> |
| <p class="toc_element"> |
| <code><a href="#cancel">cancel(name, body=None, x__xgafv=None)</a></code></p> |
| <p class="firstline">Starts asynchronous cancellation on a long-running DlpJob. The server makes a best effort to cancel the DlpJob, but success is not guaranteed.</p> |
| <p class="toc_element"> |
| <code><a href="#create">create(parent, body, x__xgafv=None)</a></code></p> |
| <p class="firstline">Creates a new job to inspect storage or calculate risk metrics.</p> |
| <p class="toc_element"> |
| <code><a href="#delete">delete(name, x__xgafv=None)</a></code></p> |
| <p class="firstline">Deletes a long-running DlpJob.</p> |
| <p class="toc_element"> |
| <code><a href="#get">get(name, x__xgafv=None)</a></code></p> |
| <p class="firstline">Gets the latest state of a long-running DlpJob.</p> |
| <p class="toc_element"> |
| <code><a href="#list">list(parent, orderBy=None, type=None, pageSize=None, pageToken=None, x__xgafv=None, filter=None)</a></code></p> |
| <p class="firstline">Lists DlpJobs that match the specified filter in the request.</p> |
| <p class="toc_element"> |
| <code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p> |
| <p class="firstline">Retrieves the next page of results.</p> |
| <h3>Method Details</h3> |
| <div class="method"> |
| <code class="details" id="cancel">cancel(name, body=None, x__xgafv=None)</code> |
| <pre>Starts asynchronous cancellation on a long-running DlpJob. The server |
| makes a best effort to cancel the DlpJob, but success is not |
| guaranteed. |
| See https://cloud.google.com/dlp/docs/inspecting-storage and |
| https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. |
| |
| Args: |
| name: string, The name of the DlpJob resource to be cancelled. (required) |
| body: object, The request body. |
| The object takes the form of: |
| |
| { # The request message for canceling a DLP job. |
| } |
| |
| x__xgafv: string, V1 error format. |
| Allowed values |
| 1 - v1 error format |
| 2 - v2 error format |
| |
| Returns: |
| An object of the form: |
| |
| { # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }</pre> |
| </div> |
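<p>A minimal sketch of calling <code>cancel</code>, assuming a client object built elsewhere (for example with <code>googleapiclient.discovery.build("dlp", "v2")</code>). The helper name <code>cancel_dlp_job</code> and the job name are hypothetical, and a <code>MagicMock</code> stands in for the real client so the example runs without credentials or network access.</p>

```python
from unittest.mock import MagicMock


def cancel_dlp_job(service, name):
    """Request cancellation of a DlpJob and return the (empty) response.

    `service` is assumed to be a DLP v2 client built elsewhere.
    `name` is the full resource name of the DlpJob to cancel.
    """
    # The request body for cancel is an empty message, so pass {}.
    request = service.projects().dlpJobs().cancel(name=name, body={})
    return request.execute()


# Stand-in for a real client (assumption: used only to make the sketch runnable).
fake_service = MagicMock()
fake_service.projects().dlpJobs().cancel().execute.return_value = {}

job_name = "projects/my-project-id/dlpJobs/i-example"  # hypothetical job name
response = cancel_dlp_job(fake_service, job_name)
print(response)
```

<p>Cancellation is best effort, so an empty response only means the request was accepted. Checking the job afterwards with <code>get(name)</code> is one way to see whether it actually stopped.</p>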
| |
| <div class="method"> |
| <code class="details" id="create">create(parent, body, x__xgafv=None)</code> |
| <pre>Creates a new job to inspect storage or calculate risk metrics. |
| See https://cloud.google.com/dlp/docs/inspecting-storage and |
| https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. |
| |
| When no InfoTypes or CustomInfoTypes are specified in inspect jobs, the |
| system will automatically choose what detectors to run. By default this may |
| be all types, but may change over time as detectors are updated. |
| |
| Args: |
| parent: string, The parent resource name, for example projects/my-project-id. (required) |
| body: object, The request body. (required) |
| The object takes the form of: |
| |
| { # Request message for CreateDlpJobRequest. Used to initiate long running |
| # jobs such as calculating risk metrics or inspecting Google Cloud |
| # Storage. |
| "riskJob": { # Configuration for a risk analysis job. See |
| # https://cloud.google.com/dlp/docs/concepts-risk-analysis to learn more. |
| "privacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. |
| "numericalStatsConfig": { # Compute numerical stats over an individual column, including |
| # min, max, and quantiles. |
| "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are |
| # integer, float, date, datetime, timestamp, time. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what |
| # is called "journalist risk" in the literature, except the attack dataset is |
| # statistically modeled instead of being perfectly known. This can be done |
| # using publicly available data (like the US Census), or using a custom |
| # statistical model (indicated as one or several BigQuery tables), or by |
| # extrapolating from the distribution of values in the input dataset. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers column must appear in exactly one column |
| # of one auxiliary table. |
| { # An auxiliary table contains statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Auxiliary table location. [required] |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. |
| "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are |
| # defined for the l-diversity computation. When multiple fields are |
| # specified, they are considered a single composite key. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to |
| # figure out that one given individual appears in a de-identified dataset. |
| # Similarly to the k-map metric, we cannot compute δ-presence exactly without |
| # knowing the attack dataset, so we use a statistical model instead. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers field must appear in exactly one |
| # field of one auxiliary table. |
| { # An auxiliary table containing statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Auxiliary table location. [required] |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "categoricalStatsConfig": { # Compute categorical stats over an individual column, including |
| # number of distinct values and value count distribution. |
| "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are |
| # supported except for arrays and structs. However, it may be more |
| # informative to use NumericalStats when the field type is supported, |
| # depending on the data. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. |
| "entityId": { # Optional message indicating that multiple rows might be associated to a |
| # single individual. If the same entity_id is associated to multiple |
| # quasi-identifier tuples over distinct rows, we consider the entire |
| # collection of tuples as the composite quasi-identifier. This collection |
| # is a multiset: the order in which the different tuples appear in the |
| # dataset is ignored, but their frequency is taken into account. |
| # |
| # Important note: a maximum of 1000 rows can be associated to a single |
| # entity ID. If more rows are associated with the same entity ID, some |
| # might be ignored. |
| # |
| # An entity in a dataset is a field or set of fields that correspond to a |
| # single person. For example, in medical records the `EntityId` might be a |
| # patient identifier, or for financial records it might be an account |
| # identifier. This message is used when generalizations or analysis must take |
| # into account that multiple rows correspond to the same entity. |
| "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are |
| # specified, they are considered a single composite key. Structs and |
| # repeated data types are not supported; however, nested fields are |
| # supported so long as they are not structs themselves or nested within |
| # a repeated field. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| }, |
| "sourceTable": { # Input dataset to compute metrics over. |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| "actions": [ # Actions to execute at the completion of the job. Are executed in the order |
| # provided. |
| { # A task to execute on the completion of a job. |
| # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. |
| "saveFindings": { # Save resulting findings in a provided location. |
| # If set, the detailed findings will be persisted to the specified |
| # OutputStorageConfig. Only a single instance of this action can be |
| # specified. |
| # Compatible with: Inspect, Risk |
| "outputConfig": { # Cloud repository for storing output. |
| "table": { # Store findings in an existing table or a new table in an existing |
| # dataset. If table_id is not set a new one will be generated |
| # for you with the following format: |
| # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for |
| # generating the date details. |
| # |
| # For Inspect, each column in an existing output table must have the same |
| # name, type, and mode of a field in the `Finding` object. |
| # |
| # For Risk, an existing output table should be the output of a previous |
| # Risk analysis job run on the same source table, with the same privacy |
| # metric and quasi-identifiers. Risk jobs that analyze the same table but |
| # compute a different privacy metric, or use different sets of |
| # quasi-identifiers, cannot store their results in the same table. |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only |
| # used for Inspect and must be unspecified for Risk jobs. Columns are derived |
| # from the `Finding` object. If appending to an existing table, any columns |
| # from the predefined schema that are missing will be added. No columns in |
| # the existing table will be deleted. |
| # |
| # If unspecified, then all available columns will be used for a new table or |
| # an (existing) table with no schema, and no changes will be made to an |
| # existing table that has a schema. |
| }, |
| }, |
| "jobNotificationEmails": { # Enable email notification to project owners and editors on job's |
| # completion/failure. |
| }, |
| "publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha). |
| # Publish the result summary of a DlpJob to the Cloud Security |
| # Command Center (CSCC Alpha). |
| # This action is only available for projects which are parts of |
| # an organization and whitelisted for the alpha Cloud Security Command |
| # Center. |
| # The action will publish count of finding instances and their info types. |
| # The summary of findings will be persisted in CSCC and are governed by CSCC |
| # service-specific policy, see https://cloud.google.com/terms/service-terms |
| # Only a single instance of this action can be specified. |
| # Compatible with: Inspect |
| }, |
| "pubSub": { # Publish a notification to a pubsub topic. |
| # Publish a message into given Pub/Sub topic when DlpJob has completed. The |
| # message contains a single field, `DlpJobName`, which is equal to the |
| # finished job's |
| # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). |
| # Compatible with: Inspect, Risk |
| "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given |
| # publishing access rights to the DLP API service account executing |
| # the long running DlpJob sending the notifications. |
| # Format is projects/{project}/topics/{topic}. |
| }, |
| }, |
| ], |
| }, |
| "jobId": "A String", # The job id can contain uppercase and lowercase letters, |
| # numbers, hyphens, and underscores; that is, it must match the regular |
| # expression: `[a-zA-Z\\d-_]+`. The maximum length is 100 |
| # characters. Can be empty to allow the system to generate one. |
| "inspectJob": { |
| "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. |
| "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. |
| "partitionId": { # Datastore partition ID. |
| # A partition ID identifies a grouping of entities. The grouping is always |
| # by project and namespace, however the namespace ID may be empty. |
| # |
| # A partition ID contains several dimensions: |
| # project ID and namespace ID. |
| "projectId": "A String", # The ID of the project to which the entities belong. |
| "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. |
| }, |
| "kind": { # A representation of a Datastore kind. # The kind to process. |
| "name": "A String", # The name of the kind. |
| }, |
| }, |
| "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. |
| "excludedFields": [ # References to fields excluded from scanning. This allows you to skip |
| # inspection of entire columns which you know have no findings. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the |
| # rest of the rows are omitted. If not set, or if set to 0, all rows will be |
| # scanned. Only one of rows_limit and rows_limit_percent can be specified. |
| # Cannot be used in conjunction with TimespanConfig. |
| "sampleMethod": "A String", |
| "identifyingFields": [ # References to fields uniquely identifying rows within the table. |
| # Nested fields, in a format like `person.birthdate.year`, are allowed. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows |
| # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and |
| # 100 mean no limit. Defaults to 0. Only one of rows_limit and |
| # rows_limit_percent can be specified. Cannot be used in conjunction with |
| # TimespanConfig. |
| "tableReference": { # Complete BigQuery table reference. |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "timespanConfig": { # Configuration of the timespan of the items to include in scanning. |
| # Currently only supported when inspecting Google Cloud Storage and BigQuery. |
| "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. |
| # Used for data sources like Datastore or BigQuery. |
| # If not specified for BigQuery, table last modification timestamp |
| # is checked against given time span. |
| # The valid data types of the timestamp field are: |
| # for BigQuery - timestamp, date, datetime; |
| # for Datastore - timestamp. |
| # Datastore entity will be scanned if the timestamp property does not exist |
| # or its value is empty or invalid. |
| "name": "A String", # Name describing the field. |
| }, |
| "endTime": "A String", # Exclude files or rows newer than this value. |
| # If set to zero, no upper time limit is applied. |
| "startTime": "A String", # Exclude files or rows older than this value. |
| "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out |
| # a valid start_time to avoid scanning files that have not been modified |
| # since the last time the JobTrigger executed. This will be based on the |
| # time of the execution of the last run of the JobTrigger. |
| }, |
| "cloudStorageOptions": { # Google Cloud Storage options specification. |
| # Options defining a file or a set of files within a Google Cloud Storage |
| # bucket. |
| "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger |
| # than this value then the rest of the bytes are omitted. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "sampleMethod": "A String", |
| "fileSet": { # Set of files to scan. # The set of one or more files to scan. |
| "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format |
| # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. |
| # |
| # If the url ends in a trailing slash, the bucket or directory represented |
| # by the url will be scanned non-recursively (content in sub-directories |
| # will not be scanned). This means that `gs://mybucket/` is equivalent to |
| # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to |
| # `gs://mybucket/directory/*`. |
| # |
| # Exactly one of `url` or `regex_file_set` must be set. |
| "regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or |
| # `regex_file_set` must be set. |
| # |
| # Message representing a set of files in a Cloud Storage bucket. Regular |
| # expressions are used to allow fine-grained control over which files in the |
| # bucket to include. |
| # |
| # Included files are those that match at least one item in `include_regex` and |
| # do not match any items in `exclude_regex`. Note that a file that matches |
| # items from both lists will _not_ be included. For a match to occur, the |
| # entire file path (i.e., everything in the url after the bucket name) must |
| # match the regular expression. |
| # |
| # For example, given the input `{bucket_name: "mybucket", include_regex: |
| # ["directory1/.*"], exclude_regex: |
| # ["directory1/excluded.*"]}`: |
| # |
| # * `gs://mybucket/directory1/myfile` will be included |
| # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches |
| # across `/`) |
| # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the |
| # full path doesn't match any items in `include_regex`) |
| # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path |
| # matches an item in `exclude_regex`) |
| # |
| # If `include_regex` is left empty, it will match all files by default |
| # (this is equivalent to setting `include_regex: [".*"]`). |
| # |
| # Some other common use cases: |
| # |
| # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all |
| # files in `mybucket` except for .pdf files |
| # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will |
| # include all files directly under `gs://mybucket/directory/`, without matching |
| # across `/` |
| "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # excluded from the scan. |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| "bucketName": "A String", # The name of a Cloud Storage bucket. Required. |
| "includeRegex": [ # A list of regular expressions matching file paths to include. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # included in the set of files, except for those that also match an item in |
| # `exclude_regex`. Leaving this field empty will match all files by default |
| # (this is equivalent to including `.*` in the list). |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| }, |
| }, |
| "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The |
| # number of bytes scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. |
| # Number of files scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. |
| "fileTypes": [ # List of file type groups to include in the scan. |
| # If empty, all files are scanned and available data format processors |
| # are applied. In addition, the binary content of the selected files |
| # is always scanned as well. |
| "A String", |
| ], |
| }, |
| }, |
| "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. |
| # When used with redactContent only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless of whether this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless of whether this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These transformations perform pseudonymization, thereby producing a |
| # "surrogate" as output. This should be used in conjunction with a field on |
| # the transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. |
| # A CustomInfoType can be either a new infoType or an extension of a built-in |
| # infoType. It extends a built-in infoType when the name matches an existing |
| # infoType and that infoType is specified in the `InspectContent.info_types` |
| # field; in that case its findings are added to those detected by the system. |
| # If the built-in infoType is not specified in the `InspectContent.info_types` |
| # list, the name is treated as a custom infoType. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # A list of phrases to detect as a CustomInfoType. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order in which they are specified. Not supported |
| # for the `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of |
| # findings within a certain proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. |
| # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. |
| # Message for specifying an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE, this infoType will not cause a finding |
| # to be returned. It can still be used for rule matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed last; other |
| # rules are executed in the order they are specified for each info type. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of |
| # findings within a certain proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. |
| # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. |
| # Message for specifying an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # Exclusion rule. The rule that specifies conditions when findings of |
| # infoTypes specified in `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # A finding is dropped when it overlaps with, or is contained within, |
| # a finding of an infoType from this list. For example, if |
| # `InspectionRuleSet.info_types` contains "PHONE_NUMBER" and |
| # `exclusion_rule` contains `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. An email address whose local part looks |
| # like a phone number therefore generates only a single finding, namely |
| # the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Dictionary which defines the rule. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. |
| # `inspect_config` will be merged into the values persisted as part of the |
| # template. |
| "actions": [ # Actions to execute at the completion of the job. |
| { # A task to execute on the completion of a job. |
| # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. |
| "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. |
| # OutputStorageConfig. Only a single instance of this action can be |
| # specified. |
| # Compatible with: Inspect, Risk |
| "outputConfig": { # Cloud repository for storing output. |
| "table": { # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| # |
| # Store findings in an existing table or a new table in an existing |
| # dataset. If table_id is not set a new one will be generated |
| # for you with the following format: |
| # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for |
| # generating the date details. |
| # |
| # For Inspect, each column in an existing output table must have the same |
| # name, type, and mode of a field in the `Finding` object. |
| # |
| # For Risk, an existing output table should be the output of a previous |
| # Risk analysis job run on the same source table, with the same privacy |
| # metric and quasi-identifiers. Risk jobs that analyze the same table but |
| # compute a different privacy metric, or use different sets of |
| # quasi-identifiers, cannot store their results in the same table. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only |
| # used for Inspect and must be unspecified for Risk jobs. Columns are derived |
| # from the `Finding` object. If appending to an existing table, any columns |
| # from the predefined schema that are missing will be added. No columns in |
| # the existing table will be deleted. |
| # |
| # If unspecified, then all available columns will be used for a new table or |
| # an (existing) table with no schema, and no changes will be made to an |
| # existing table that has a schema. |
| }, |
| }, |
| "jobNotificationEmails": { # Enable email notification to project owners and editors on job's |
| # completion/failure. |
| }, |
| "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security |
| # Command Center (CSCC Alpha). |
| # This action is only available for projects that are part of |
| # an organization and whitelisted for the alpha Cloud Security Command |
| # Center. |
| # The action will publish the count of finding instances and their info types. |
| # The summary of findings will be persisted in CSCC and is governed by CSCC |
| # service-specific policy; see https://cloud.google.com/terms/service-terms. |
| # Only a single instance of this action can be specified. |
| # Compatible with: Inspect |
| }, |
| "pubSub": { # Publish a message to the given Pub/Sub topic when the DlpJob has completed. The |
| # message contains a single field, `DlpJobName`, which is equal to the |
| # finished job's |
| # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). |
| # Compatible with: Inspect, Risk |
| "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must grant |
| # publishing access rights to the DLP API service account executing |
| # the long running DlpJob sending the notifications. |
| # Format is projects/{project}/topics/{topic}. |
| }, |
| }, |
| ], |
| }, |
| } |
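A minimal sketch, in Python, of assembling the `inspectConfig` portion of the request body while checking a few of the limits stated in the field descriptions above. The helper names (`info_type`, `regex`, `hotword_rule`), the `EMPLOYEE_ID` custom type with its pattern, and the `(415)` hotword are hypothetical and not part of the API. Treating the 1000-character window limit as `windowBefore + windowAfter` is an assumption. Nothing here calls the service.

```python
import re

# Limits taken from the field descriptions above.
INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")
MAX_GROUP_INDEXES = 3
MAX_WINDOW_CHARS = 1000  # assumption: applied to windowBefore + windowAfter


def info_type(name):
    """Build an infoType message, enforcing the documented name pattern."""
    if not INFO_TYPE_NAME.fullmatch(name):
        raise ValueError("invalid infoType name: %r" % name)
    return {"name": name}


def regex(pattern, group_indexes=()):
    """Build a regex message; no more than 3 groupIndexes may be included."""
    if len(group_indexes) > MAX_GROUP_INDEXES:
        raise ValueError("no more than 3 groupIndexes may be included")
    message = {"pattern": pattern}
    if group_indexes:
        message["groupIndexes"] = list(group_indexes)
    return message


def hotword_rule(hotword_pattern, window_before, window_after, relative_likelihood):
    """Build a hotwordRule that shifts likelihood near a matching hotword."""
    if window_before + window_after > MAX_WINDOW_CHARS:
        raise ValueError("proximity window cannot exceed 1000 characters")
    return {
        "hotwordRegex": regex(hotword_pattern),
        "proximity": {"windowBefore": window_before, "windowAfter": window_after},
        "likelihoodAdjustment": {"relativeLikelihood": relative_likelihood},
    }


inspect_config = {
    "infoTypes": [info_type("PHONE_NUMBER")],
    "minLikelihood": "POSSIBLE",
    "includeQuote": True,
    "limits": {"maxFindingsPerItem": 100},
    "customInfoTypes": [
        {
            "infoType": info_type("EMPLOYEE_ID"),  # hypothetical custom type
            "regex": regex(r"E-\d{6}"),            # hypothetical pattern
            "likelihood": "LIKELY",
        }
    ],
    "ruleSet": [
        {
            "infoTypes": [info_type("PHONE_NUMBER")],
            "rules": [{"hotwordRule": hotword_rule(r"\(415\)", 0, 20, 1)}],
        }
    ],
}
```

Validating on the client side keeps malformed names or oversized windows from reaching the service; the resulting dict would be placed under the `inspectConfig` key of the body shown above.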
| |
| x__xgafv: string, V1 error format. |
| Allowed values |
| 1 - v1 error format |
| 2 - v2 error format |
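The `relativeLikelihood` behaviour described for `likelihoodAdjustment` in the request body can be illustrated with a short sketch. It assumes the five likelihood levels named in that description are ordered as listed below and that clamping happens after every single adjustment, which is what the `VERY_LIKELY` example implies; the function name `adjust` is hypothetical.

```python
# Likelihood levels in ascending order (assumed ordering).
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust(likelihood, relative_likelihood):
    """Shift a likelihood by a number of levels, clamped to the valid range."""
    index = LEVELS.index(likelihood) + relative_likelihood
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


# POSSIBLE moves up or down one level.
upgraded = adjust("POSSIBLE", 1)
downgraded = adjust("POSSIBLE", -1)

# +1 on VERY_LIKELY is clamped, so the following -1 lands on LIKELY.
clamped_then_lowered = adjust(adjust("VERY_LIKELY", 1), -1)
```

Because each step is clamped independently, a sequence of adjustments is not equivalent to applying their sum once.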
| |
| Returns: |
| An object of the form: |
| |
| { # Combines all of the information about a DLP job. |
| "errors": [ # A stream of errors encountered running the job. |
| { # Detailed information about an error encountered during job execution or |
| # the results of an unsuccessful activation of the JobTrigger. |
| # Output only field. |
| "timestamps": [ # The times the error occurred. |
| "A String", |
| ], |
| "details": { # The `Status` type defines a logical error model that is suitable for |
| # different programming environments, including REST APIs and RPC APIs. It is |
| # used by [gRPC](https://github.com/grpc). Each `Status` message contains |
| # three pieces of data: error code, error message, and error details. |
| # |
| # You can find out more about this error model and how to work with it in the |
| # [API Design Guide](https://cloud.google.com/apis/design/errors). |
| "message": "A String", # A developer-facing error message, which should be in English. Any |
| # user-facing error message should be localized and sent in the |
| # google.rpc.Status.details field, or localized by the client. |
| "code": 42, # The status code, which should be an enum value of google.rpc.Code. |
| "details": [ # A list of messages that carry the error details. There is a common set of |
| # message types for APIs to use. |
| { |
| "a_key": "", # Properties of the object. Contains field @type with type URL. |
| }, |
| ], |
| }, |
| }, |
| ], |
| "name": "A String", # The server-assigned name. |
| "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source. |
| "requestedOptions": { # The configuration used for this job. |
| "snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of |
| # this run. |
| # The inspectTemplate contains a configuration (set of types of sensitive data |
| # to be detected) to be used anywhere you otherwise would normally specify |
| # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates |
| # to learn more. |
| "updateTime": "A String", # The last update timestamp of an inspectTemplate. Output only field. |
| "displayName": "A String", # Display name (max 256 chars). |
| "description": "A String", # Short description (max 256 chars). |
| "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process. |
| # When used with redactContent only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless of whether this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless of whether this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These transformations perform pseudonymization, thereby producing a |
| # "surrogate" as output. This should be used in conjunction with a field on |
| # the transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. |
| # A CustomInfoType can be either a new infoType or an extension of a built-in |
| # infoType. It extends a built-in infoType when the name matches an existing |
| # infoType and that infoType is specified in the `InspectContent.info_types` |
| # field; in that case its findings are added to those detected by the system. |
| # If the built-in infoType is not specified in the `InspectContent.info_types` |
| # list, the name is treated as a custom infoType. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # A list of phrases to detect as a CustomInfoType. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order in which they are specified. Not supported for the |
| # `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE, this infoType will not cause a finding |
| # to be returned. It can still be used for rule matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed last; other |
| # rules are executed in the order in which they are specified for each infoType. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. |
| # `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # An ExclusionRule drops a finding when it overlaps with or is |
| # contained within a finding of an infoType from this list. For |
| # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and |
| # `exclusion_rule` containing `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. |
| # As a result, an email address that contains a phone number generates only |
| # a single finding, namely the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the Unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "createTime": "A String", # The creation timestamp of an inspectTemplate. Output-only field. |
| "name": "A String", # The template name. Output only. |
| # |
| # The template will have one of the following formats: |
| # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR |
| # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID` |
| }, |
| "jobConfig": { |
| "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. |
| "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. |
| "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always |
| # by project and namespace, however the namespace ID may be empty. |
| # |
| # A partition ID contains several dimensions: |
| # project ID and namespace ID. |
| "projectId": "A String", # The ID of the project to which the entities belong. |
| "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. |
| }, |
| "kind": { # A representation of a Datastore kind. # The kind to process. |
| "name": "A String", # The name of the kind. |
| }, |
| }, |
| "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. |
| "excludedFields": [ # References to fields excluded from scanning. This allows you to skip |
| # inspection of entire columns which you know have no findings. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the |
| # rest of the rows are omitted. If not set, or if set to 0, all rows will be |
| # scanned. Only one of rows_limit and rows_limit_percent can be specified. |
| # Cannot be used in conjunction with TimespanConfig. |
| "sampleMethod": "A String", |
| "identifyingFields": [ # References to fields uniquely identifying rows within the table. |
| # Nested fields, in a format like `person.birthdate.year`, are allowed. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows |
| # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and |
| # 100 mean no limit. Defaults to 0. Only one of rows_limit and |
| # rows_limit_percent can be specified. Cannot be used in conjunction with |
| # TimespanConfig. |
| "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference. |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "timespanConfig": { # Configuration of the timespan of the items to include in scanning. |
| # Currently only supported when inspecting Google Cloud Storage and BigQuery. |
| "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. |
| # Used for data sources like Datastore or BigQuery. |
| # If not specified for BigQuery, table last modification timestamp |
| # is checked against given time span. |
| # The valid data types of the timestamp field are: |
| # for BigQuery - timestamp, date, datetime; |
| # for Datastore - timestamp. |
| # Datastore entity will be scanned if the timestamp property does not exist |
| # or its value is empty or invalid. |
| "name": "A String", # Name describing the field. |
| }, |
| "endTime": "A String", # Exclude files or rows newer than this value. |
| # If set to zero, no upper time limit is applied. |
| "startTime": "A String", # Exclude files or rows older than this value. |
| "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out |
| # a valid start_time to avoid scanning files that have not been modified |
| # since the last time the JobTrigger executed. This will be based on the |
| # time of the execution of the last run of the JobTrigger. |
| }, |
| "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification. |
| # bucket. |
| "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger |
| # than this value then the rest of the bytes are omitted. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "sampleMethod": "A String", |
| "fileSet": { # Set of files to scan. # The set of one or more files to scan. |
| "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format |
| # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. |
| # |
| # If the url ends in a trailing slash, the bucket or directory represented |
| # by the url will be scanned non-recursively (content in sub-directories |
| # will not be scanned). This means that `gs://mybucket/` is equivalent to |
| # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to |
| # `gs://mybucket/directory/*`. |
| # |
| # Exactly one of `url` or `regex_file_set` must be set. |
| "regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or |
| # `regex_file_set` must be set. |
| # |
| # Message representing a set of files in a Cloud Storage bucket. Regular |
| # expressions are used to allow fine-grained control over which files in the |
| # bucket to include. |
| # |
| # Included files are those that match at least one item in `include_regex` and |
| # do not match any items in `exclude_regex`. Note that a file that matches |
| # items from both lists will _not_ be included. For a match to occur, the |
| # entire file path (i.e., everything in the url after the bucket name) must |
| # match the regular expression. |
| # |
| # For example, given the input `{bucket_name: "mybucket", include_regex: |
| # ["directory1/.*"], exclude_regex: |
| # ["directory1/excluded.*"]}`: |
| # |
| # * `gs://mybucket/directory1/myfile` will be included |
| # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches |
| # across `/`) |
| # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the |
| # full path doesn't match any items in `include_regex`) |
| # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path |
| # matches an item in `exclude_regex`) |
| # |
| # If `include_regex` is left empty, it will match all files by default |
| # (this is equivalent to setting `include_regex: [".*"]`). |
| # |
| # Some other common use cases: |
| # |
| # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all |
| # files in `mybucket` except for .pdf files |
| # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will |
| # include all files directly under `gs://mybucket/directory/`, without matching |
| # across `/` |
| "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # excluded from the scan. |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| "bucketName": "A String", # The name of a Cloud Storage bucket. Required. |
| "includeRegex": [ # A list of regular expressions matching file paths to include. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # included in the set of files, except for those that also match an item in |
| # `exclude_regex`. Leaving this field empty will match all files by default |
| # (this is equivalent to including `.*` in the list). |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| }, |
| }, |
| "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The |
| # number of bytes scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. |
| # Number of files scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. |
| "fileTypes": [ # List of file type groups to include in the scan. |
| # If empty, all files are scanned and available data format processors |
| # are applied. In addition, the binary content of the selected files |
| # is always scanned as well. |
| "A String", |
| ], |
| }, |
| }, |
| "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. |
| # When used with redactContent, only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless of whether this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless of whether this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These types of transformations are |
| # those that perform pseudonymization, thereby producing a "surrogate" as |
| # output. This should be used in conjunction with a field on the |
| # transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in |
| # infoType, when the name matches one of the existing infoTypes and that infoType |
| # is specified in the `InspectContent.info_types` field. Specifying the latter |
| # adds findings to those detected by the system. If the built-in infoType is |
| # not specified in the `InspectContent.info_types` list, then the name is treated |
| # as a custom infoType. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the Unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order in which they are specified. Not supported for the |
| # `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding |
| # to be returned. It still can be used for rules matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed last; other |
| # rules are executed in the order they are specified for each infoType. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. |
| # `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of excluded infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # The exclusion rule drops a finding when it overlaps with or is |
| # contained within a finding of an infoType from this list. For |
| # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and |
| # `exclusion_rule` containing `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. |
| # As a result, a phone number embedded in an email address generates only |
| # a single finding, namely the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. |
| # `inspect_config` will be merged into the values persisted as part of the |
| # template. |
| "actions": [ # Actions to execute at the completion of the job. |
| { # A task to execute on the completion of a job. |
| # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. |
| "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. |
| # OutputStorageConfig. Only a single instance of this action can be |
| # specified. |
| # Compatible with: Inspect, Risk |
| "outputConfig": { # Cloud repository for storing output. |
| "table": { # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| # |
| # Store findings in an existing table or a new table in an existing |
| # dataset. If table_id is not set a new one will be generated |
| # for you with the following format: |
| # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for |
| # generating the date details. |
| # |
| # For Inspect, each column in an existing output table must have the same |
| # name, type, and mode of a field in the `Finding` object. |
| # |
| # For Risk, an existing output table should be the output of a previous |
| # Risk analysis job run on the same source table, with the same privacy |
| # metric and quasi-identifiers. Risk jobs that analyze the same table but |
| # compute a different privacy metric, or use different sets of |
| # quasi-identifiers, cannot store their results in the same table. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only |
| # used for Inspect and must be unspecified for Risk jobs. Columns are derived |
| # from the `Finding` object. If appending to an existing table, any columns |
| # from the predefined schema that are missing will be added. No columns in |
| # the existing table will be deleted. |
| # |
| # If unspecified, then all available columns will be used for a new table or |
| # an (existing) table with no schema, and no changes will be made to an |
| # existing table that has a schema. |
| }, |
| }, |
| "jobNotificationEmails": { # Enable email notification to project owners and editors on job's completion/failure. |
| }, |
| "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). |
| # Command Center (CSCC Alpha). |
| # This action is only available for projects that are part of |
| # an organization and whitelisted for the alpha Cloud Security Command |
| # Center. |
| # The action will publish the count of finding instances and their info types. |
| # The summary of findings will be persisted in CSCC and is governed by the CSCC |
| # service-specific policy; see https://cloud.google.com/terms/service-terms |
| # Only a single instance of this action can be specified. |
| # Compatible with: Inspect |
| }, |
| "pubSub": { # Publish a message into the given Pub/Sub topic when the DlpJob has completed. The # Publish a notification to a Pub/Sub topic. |
| # message contains a single field, `DlpJobName`, which is equal to the |
| # finished job's |
| # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). |
| # Compatible with: Inspect, Risk |
| "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given |
| # publishing access rights to the DLP API service account executing |
| # the long running DlpJob sending the notifications. |
| # Format is projects/{project}/topics/{topic}. |
| }, |
| }, |
| ], |
| }, |
| }, |
| "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job. |
| "infoTypeStats": [ # Statistics of how many instances of each info type were found during |
| # inspect job. |
| { # Statistics regarding a specific InfoType. |
| "count": "A String", # Number of findings for this infoType. |
| "infoType": { # Type of information detected by the API. # The type of finding this stat is for. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| }, |
| ], |
| "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process. |
| "processedBytes": "A String", # Total size in bytes that were processed. |
| }, |
| }, |
| "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source. |
| "numericalStatsResult": { # Result of the numerical stats computation. |
| "quantileValues": [ # List of 99 values that partition the set of field values into 100 |
| # equal-sized buckets. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an |
| # estimation, not exact values. |
| "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value |
| # doesn't correspond to any such interval, the associated frequency is |
| # zero. For example, the following records: |
| # {min_anonymity: 1, max_anonymity: 1, frequency: 17} |
| # {min_anonymity: 2, max_anonymity: 3, frequency: 42} |
| # {min_anonymity: 5, max_anonymity: 10, frequency: 99} |
| # mean that there are no records with an estimated anonymity of 4 or |
| # larger than 10. |
| { # A KMapEstimationHistogramBucket message with the following values: |
| # min_anonymity: 3 |
| # max_anonymity: 5 |
| # frequency: 42 |
| # means that there are 42 records whose quasi-identifier values correspond |
| # to 3, 4 or 5 people in the overlying population. An important particular |
| # case is when min_anonymity = max_anonymity = 1: the frequency field then |
| # corresponds to the number of uniquely identifiable records. |
| "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total |
| # number of classes returned per bucket is capped at 20. |
| { # A tuple of values for the quasi-identifier columns. |
| "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values. |
| "quasiIdsValues": [ # The quasi-identifier values. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| }, |
| ], |
| "minAnonymity": "A String", # Always positive. |
| "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. |
| "maxAnonymity": "A String", # Always greater than or equal to min_anonymity. |
| "bucketSize": "A String", # Number of records within these anonymity bounds. |
| }, |
| ], |
| }, |
| "kAnonymityResult": { # Result of the k-anonymity computation. |
| "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes. |
| { |
| "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of |
| # classes returned per bucket is capped at 20. |
| { # The set of columns' values that share the same k-anonymity value. |
| "quasiIdsValues": [ # Set of values defining the equivalence class. One value per |
| # quasi-identifier column in the original KAnonymity metric message. |
| # The order is always the same as the original request. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the |
| # above set of values. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. |
| "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket. |
| "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket. |
| "bucketSize": "A String", # Total number of equivalence classes in this bucket. |
| }, |
| ], |
| }, |
| "lDiversityResult": { # Result of the l-diversity computation. |
| "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies. |
| { |
| "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of |
| # classes returned per bucket is capped at 20. |
| { # The set of columns' values that share the same l-diversity value. |
| "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class. |
| "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence |
| # class. The order is always the same as the original request. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "topSensitiveValues": [ # Estimated frequencies of top sensitive values. |
| { # A value of a field, including its frequency. |
| "count": "A String", # How many times the value is contained in the field. |
| "value": { # Set of primitive values supported by the system. # A value contained in the field in question. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| ], |
| "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. |
| "bucketSize": "A String", # Total number of equivalence classes in this bucket. |
| "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence |
| # classes in this bucket. |
| "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence |
| # classes in this bucket. |
| }, |
| ], |
| }, |
| "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. |
| "numericalStatsConfig": { # Compute numerical stats over an individual column, including |
| # min, max, and quantiles. |
| "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are |
| # integer, float, date, datetime, timestamp, time. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what |
| # is called "journalist risk" in the literature, except the attack dataset is |
| # statistically modeled instead of being perfectly known. This can be done |
| # using publicly available data (like the US Census), or using a custom |
| # statistical model (indicated as one or several BigQuery tables), or by |
| # extrapolating from the distribution of values in the input dataset. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers column must appear in exactly one column |
| # of one auxiliary table. |
| { # An auxiliary table contains statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. |
| "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are |
| # defined for the l-diversity computation. When multiple fields are |
| # specified, they are considered a single composite key. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to |
| # figure out that one given individual appears in a de-identified dataset. |
| # Similarly to the k-map metric, we cannot compute δ-presence exactly without |
| # knowing the attack dataset, so we use a statistical model instead. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers field must appear in exactly one |
| # field of one auxiliary table. |
| { # An auxiliary table containing statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "categoricalStatsConfig": { # Compute categorical stats over an individual column, including |
| # number of distinct values and value count distribution. |
| "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are |
| # supported except for arrays and structs. However, it may be more |
| # informative to use NumericalStats when the field type is supported, |
| # depending on the data. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. |
| "entityId": { # Optional message indicating that multiple rows might be associated with a |
| # single individual. If the same entity_id is associated with multiple |
| # quasi-identifier tuples over distinct rows, we consider the entire |
| # collection of tuples as the composite quasi-identifier. This collection |
| # is a multiset: the order in which the different tuples appear in the |
| # dataset is ignored, but their frequency is taken into account. |
| # |
| # Important note: a maximum of 1000 rows can be associated with a single |
| # entity ID. If more rows are associated with the same entity ID, some |
| # might be ignored. |
| # |
| # An entity in a dataset is a field or set of fields that correspond to a |
| # single person. For example, in medical records the `EntityId` might be a |
| # patient identifier, or for financial records it might be an account |
| # identifier. This message is used when generalizations or analysis must take |
| # into account that multiple rows correspond to the same entity. |
| "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are |
| # specified, they are considered a single composite key. Structs and |
| # repeated data types are not supported; however, nested fields are |
| # supported so long as they are not structs themselves or nested within |
| # a repeated field. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| }, |
| "categoricalStatsResult": { # Result of the categorical stats computation. |
| "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column. |
| { |
| "bucketValues": [ # Sample of value frequencies in this bucket. The total number of |
| # values returned per bucket is capped at 20. |
| { # A value of a field, including its frequency. |
| "count": "A String", # How many times the value is contained in the field. |
| "value": { # Set of primitive values supported by the system. # A value contained in the field in question. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct values in this bucket. |
| "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket. |
| "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket. |
| "bucketSize": "A String", # Total number of values in this bucket. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an |
| # estimation, not exact values. |
| "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a |
| # value doesn't correspond to any such interval, the associated frequency |
| # is zero. For example, the following records: |
| # {min_probability: 0, max_probability: 0.1, frequency: 17} |
| # {min_probability: 0.2, max_probability: 0.3, frequency: 42} |
| # {min_probability: 0.3, max_probability: 0.4, frequency: 99} |
| # mean that there are no records with an estimated probability in [0.1, 0.2) |
| # or greater than or equal to 0.4. |
| { # A DeltaPresenceEstimationHistogramBucket message with the following |
| # values: |
| # min_probability: 0.1 |
| # max_probability: 0.2 |
| # frequency: 42 |
| # means that there are 42 records for which δ is in [0.1, 0.2). An |
| # important particular case is when min_probability = max_probability = 1: |
| # then, every individual who shares this quasi-identifier combination is in |
| # the dataset. |
| "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total |
| # number of classes returned per bucket is capped at 20. |
| { # A tuple of values for the quasi-identifier columns. |
| "quasiIdsValues": [ # The quasi-identifier values. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these |
| # quasi-identifier values is in the dataset. This value, typically called |
| # δ, is the ratio between the number of records in the dataset with these |
| # quasi-identifier values, and the total number of individuals (inside |
| # *and* outside the dataset) with these quasi-identifier values. |
| # For example, if there are 15 individuals in the dataset who share the |
| # same quasi-identifier values, and an estimated 100 people in the entire |
| # population with these values, then δ is 0.15. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. |
| "bucketSize": "A String", # Number of records within these probability bounds. |
| "maxProbability": 3.14, # Always greater than or equal to min_probability. |
| "minProbability": 3.14, # Between 0 and 1. |
| }, |
| ], |
| }, |
| "requestedSourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over. |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "state": "A String", # State of a job. |
| "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that |
| # instantiated the job. |
| "startTime": "A String", # Time when the job started. |
| "endTime": "A String", # Time when the job finished. |
| "type": "A String", # The type of job. |
| "createTime": "A String", # Time when the job was created. |
| }</pre> |
| </div> |
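<p>The δ-presence fields described above can be read with a small helper. The following is a minimal sketch, not part of the client library: the function names <code>estimated_delta</code> and <code>bucket_size_for</code> are hypothetical, and the histogram is assumed to be the list found under <code>deltaPresenceEstimationHistogram</code>, with <code>bucketSize</code> parsed from its string form.</p>

```python
from typing import Sequence


def estimated_delta(records_in_dataset: int, estimated_population: float) -> float:
    """Ratio of matching records in the dataset to the estimated population (δ)."""
    if estimated_population <= 0:
        raise ValueError("estimated_population must be positive")
    return records_in_dataset / estimated_population


def bucket_size_for(probability: float, histogram: Sequence[dict]) -> int:
    """Return bucketSize for the bucket covering `probability`, or 0 if none does.

    Buckets cover the half-open interval [minProbability, maxProbability).
    The special case minProbability == maxProbability == 1 is matched exactly.
    """
    for bucket in histogram:
        low = float(bucket["minProbability"])
        high = float(bucket["maxProbability"])
        if low <= probability < high or (low == high == probability):
            return int(bucket["bucketSize"])
    return 0


# Illustrative data only, mirroring the three records in the field description.
example_histogram = [
    {"minProbability": 0.0, "maxProbability": 0.1, "bucketSize": "17"},
    {"minProbability": 0.2, "maxProbability": 0.3, "bucketSize": "42"},
    {"minProbability": 0.3, "maxProbability": 0.4, "bucketSize": "99"},
]
```

<p>With this data, a probability in [0.1, 0.2) or at 0.4 and above falls in no bucket, so the helper reports a frequency of zero, matching the rule stated in the field description.</p>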
| |
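<p>The <code>kAnonymityConfig</code> shape shown in <code>requestedPrivacyMetric</code> can be assembled as a plain dictionary. This is a minimal sketch: the helper name <code>k_anonymity_metric</code> and the column names in the usage note are hypothetical, and only the keys documented above are used.</p>

```python
from typing import Optional, Sequence


def k_anonymity_metric(quasi_id_fields: Sequence[str],
                       entity_id_field: Optional[str] = None) -> dict:
    """Build a privacy-metric dict containing a kAnonymityConfig.

    Each quasi-identifier is a field identifier of the form {"name": ...}.
    When several fields are given, they are treated as one composite key.
    """
    config = {"quasiIds": [{"name": field} for field in quasi_id_fields]}
    if entity_id_field is not None:
        # entityId is optional; it marks rows that belong to the same individual.
        config["entityId"] = {"field": {"name": entity_id_field}}
    return {"kAnonymityConfig": config}
```

<p>For example, <code>k_anonymity_metric(["zip_code", "age"], entity_id_field="patient_id")</code> yields a dictionary with two <code>quasiIds</code> entries and an <code>entityId</code> pointing at the <code>patient_id</code> column.</p>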
| <div class="method"> |
| <code class="details" id="delete">delete(name, x__xgafv=None)</code> |
| <pre>Deletes a long-running DlpJob. This method indicates that the client is |
| no longer interested in the DlpJob result. The job will be cancelled if |
| possible. |
| See https://cloud.google.com/dlp/docs/inspecting-storage and |
| https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. |
| |
| Args: |
| name: string, The name of the DlpJob resource to be deleted. (required) |
| x__xgafv: string, V1 error format. |
| Allowed values |
| 1 - v1 error format |
| 2 - v2 error format |
| |
| Returns: |
| An object of the form: |
| |
| { # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }</pre> |
| </div> |
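<p>A call to <code>delete</code> might look like the sketch below. The wrapper name <code>delete_dlp_job</code> is hypothetical, and the job name in the usage note is a placeholder. The <code>service</code> argument is assumed to be a DLP v2 service object from the Google API Python client.</p>

```python
def delete_dlp_job(service, name: str) -> dict:
    """Delete a DlpJob by resource name and return the (empty) response body.

    `service` is assumed to expose projects().dlpJobs().delete(name=...),
    as documented for this resource.
    """
    if not name:
        raise ValueError("name is required")
    request = service.projects().dlpJobs().delete(name=name)
    # execute() sends the request; a successful delete returns an empty object.
    return request.execute()
```

<p>The service object is typically created with <code>googleapiclient.discovery.build("dlp", "v2")</code>, after which <code>delete_dlp_job(service, "projects/my-project/dlpJobs/my-job")</code> would issue the request (both path segments are placeholders).</p>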
| |
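<p>The InfoType name pattern <code>[a-zA-Z0-9_]{1,64}</code> quoted in the field descriptions can be checked on the client side before a request is sent. This is a minimal sketch: the helper name <code>is_valid_info_type_name</code> is hypothetical, and the documentation only says names "should" conform, so how the server treats non-conforming names is not assumed here.</p>

```python
import re

# Pattern copied from the InfoType "name" field description.
_INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")


def is_valid_info_type_name(name: str) -> bool:
    """Return True if the whole name matches [a-zA-Z0-9_]{1,64}."""
    return _INFO_TYPE_NAME.fullmatch(name) is not None
```

<p>Using <code>fullmatch</code> rather than <code>match</code> ensures that trailing characters outside the pattern, or names longer than 64 characters, are rejected.</p>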
| <div class="method"> |
| <code class="details" id="get">get(name, x__xgafv=None)</code> |
| <pre>Gets the latest state of a long-running DlpJob. |
| See https://cloud.google.com/dlp/docs/inspecting-storage and |
| https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. |
| |
| Args: |
| name: string, The name of the DlpJob resource. (required) |
| x__xgafv: string, V1 error format. |
| Allowed values |
| 1 - v1 error format |
| 2 - v2 error format |
| |
| Returns: |
| An object of the form: |
| |
| { # Combines all of the information about a DLP job. |
| "errors": [ # A stream of errors encountered running the job. |
| { # Detailed information about an error encountered during job execution or |
| # the results of an unsuccessful activation of the JobTrigger. |
| # Output only field. |
| "timestamps": [ # The times the error occurred. |
| "A String", |
| ], |
| "details": { # The `Status` type defines a logical error model that is suitable for |
| # different programming environments, including REST APIs and RPC APIs. It is |
| # used by [gRPC](https://github.com/grpc). Each `Status` message contains |
| # three pieces of data: error code, error message, and error details. |
| # |
| # You can find out more about this error model and how to work with it in the |
| # [API Design Guide](https://cloud.google.com/apis/design/errors). |
| "message": "A String", # A developer-facing error message, which should be in English. Any |
| # user-facing error message should be localized and sent in the |
| # google.rpc.Status.details field, or localized by the client. |
| "code": 42, # The status code, which should be an enum value of google.rpc.Code. |
| "details": [ # A list of messages that carry the error details. There is a common set of |
| # message types for APIs to use. |
| { |
| "a_key": "", # Properties of the object. Contains field @type with type URL. |
| }, |
| ], |
| }, |
| }, |
| ], |
| "name": "A String", # The server-assigned name. |
| "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source. |
| "requestedOptions": { # The configuration used for this job. |
| "snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of |
| # this run. |
| # |
| # The inspectTemplate contains a configuration (set of types of sensitive data |
| # to be detected) to be used anywhere you otherwise would normally specify |
| # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates |
| # to learn more. |
| "updateTime": "A String", # The last update timestamp of an inspectTemplate, output only field. |
| "displayName": "A String", # Display name (max 256 chars). |
| "description": "A String", # Short description (max 256 chars). |
| "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process. |
| # When used with redactContent only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless of whether this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless of whether this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These types of transformations are |
| # those that perform pseudonymization, thereby producing a "surrogate" as |
| # output. This should be used in conjunction with a field on the |
| # transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in |
| # infoType, when the name matches one of the existing infoTypes and that infoType |
| # is specified in the `InspectContent.info_types` field. Specifying the latter |
| # adds findings to those detected by the system. If the built-in infoType is |
| # not specified in the `InspectContent.info_types` list, then the name is treated |
| # as a custom infoType. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # A list of phrases to detect as a CustomInfoType. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order in which they are specified. Not supported for the |
| # `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # Hotword-based detection rule. Adjusts the likelihood of findings within a certain |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. |
| # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. Specifies an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding |
| # to be returned. It still can be used for rules matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed last; other |
| # rules are executed in the order in which they are specified for each infoType. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # Hotword-based detection rule. Adjusts the likelihood of findings within a certain |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. |
| # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. Specifies an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # Exclusion rule. Specifies the conditions under which findings of infoTypes specified in |
| # `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of excluded infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # The ExclusionRule drops a finding when it overlaps with or is |
| # contained within a finding of an infoType from this list. For |
| # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and |
| # `exclusion_rule` containing `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. |
| # As a result, an email address whose local part looks like a phone number |
| # generates only a single finding, namely the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Dictionary which defines the rule. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "createTime": "A String", # The creation timestamp of an inspectTemplate. Output only. |
| "name": "A String", # The template name. Output only. |
| # |
| # The template will have one of the following formats: |
| # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR |
| # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID` |
| }, |
| "jobConfig": { |
| "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. |
| "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. |
| "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always |
| # by project and namespace, however the namespace ID may be empty. |
| # |
| # A partition ID contains several dimensions: |
| # project ID and namespace ID. |
| "projectId": "A String", # The ID of the project to which the entities belong. |
| "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. |
| }, |
| "kind": { # A representation of a Datastore kind. # The kind to process. |
| "name": "A String", # The name of the kind. |
| }, |
| }, |
| "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. |
| "excludedFields": [ # References to fields excluded from scanning. This allows you to skip |
| # inspection of entire columns which you know have no findings. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the |
| # rest of the rows are omitted. If not set, or if set to 0, all rows will be |
| # scanned. Only one of rows_limit and rows_limit_percent can be specified. |
| # Cannot be used in conjunction with TimespanConfig. |
| "sampleMethod": "A String", |
| "identifyingFields": [ # References to fields uniquely identifying rows within the table. |
| # Nested fields in a format like `person.birthdate.year` are allowed. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows |
| # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and |
| # 100 mean no limit. Defaults to 0. Only one of rows_limit and |
| # rows_limit_percent can be specified. Cannot be used in conjunction with |
| # TimespanConfig. |
| "tableReference": { # Complete BigQuery table reference. |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "timespanConfig": { # Configuration of the timespan of the items to include in scanning. |
| # Currently only supported when inspecting Google Cloud Storage and BigQuery. |
| "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. |
| # Used for data sources like Datastore or BigQuery. |
| # If not specified for BigQuery, table last modification timestamp |
| # is checked against given time span. |
| # The valid data types of the timestamp field are: |
| # for BigQuery - timestamp, date, datetime; |
| # for Datastore - timestamp. |
| # A Datastore entity will be scanned if the timestamp property does not exist |
| # or its value is empty or invalid. |
| "name": "A String", # Name describing the field. |
| }, |
| "endTime": "A String", # Exclude files or rows newer than this value. |
| # If set to zero, no upper time limit is applied. |
| "startTime": "A String", # Exclude files or rows older than this value. |
| "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out |
| # a valid start_time to avoid scanning files that have not been modified |
| # since the last time the JobTrigger executed. This will be based on the |
| # time of the execution of the last run of the JobTrigger. |
| }, |
| "cloudStorageOptions": { # Google Cloud Storage options specification. Options defining a file or a set of files within a Google Cloud Storage |
| # bucket. |
| "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger |
| # than this value then the rest of the bytes are omitted. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "sampleMethod": "A String", |
| "fileSet": { # Set of files to scan. # The set of one or more files to scan. |
| "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format |
| # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. |
| # |
| # If the url ends in a trailing slash, the bucket or directory represented |
| # by the url will be scanned non-recursively (content in sub-directories |
| # will not be scanned). This means that `gs://mybucket/` is equivalent to |
| # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to |
| # `gs://mybucket/directory/*`. |
| # |
| # Exactly one of `url` or `regex_file_set` must be set. |
| "regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or |
| # `regex_file_set` must be set. |
| # Message representing a set of files in a Cloud Storage bucket. Regular |
| # expressions are used to allow fine-grained control over which files in the |
| # bucket to include. |
| # |
| # Included files are those that match at least one item in `include_regex` and |
| # do not match any items in `exclude_regex`. Note that a file that matches |
| # items from both lists will _not_ be included. For a match to occur, the |
| # entire file path (i.e., everything in the url after the bucket name) must |
| # match the regular expression. |
| # |
| # For example, given the input `{bucket_name: "mybucket", include_regex: |
| # ["directory1/.*"], exclude_regex: |
| # ["directory1/excluded.*"]}`: |
| # |
| # * `gs://mybucket/directory1/myfile` will be included |
| # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches |
| # across `/`) |
| # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the |
| # full path doesn't match any items in `include_regex`) |
| # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path |
| # matches an item in `exclude_regex`) |
| # |
| # If `include_regex` is left empty, it will match all files by default |
| # (this is equivalent to setting `include_regex: [".*"]`). |
| # |
| # Some other common use cases: |
| # |
| # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all |
| # files in `mybucket` except for .pdf files |
| # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will |
| # include all files directly under `gs://mybucket/directory/`, without matching |
| # across `/` |
| "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # excluded from the scan. |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| "bucketName": "A String", # The name of a Cloud Storage bucket. Required. |
| "includeRegex": [ # A list of regular expressions matching file paths to include. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # included in the set of files, except for those that also match an item in |
| # `exclude_regex`. Leaving this field empty will match all files by default |
| # (this is equivalent to including `.*` in the list). |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| }, |
| }, |
| "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The |
| # number of bytes scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. |
| # Number of files scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. |
| "fileTypes": [ # List of file type groups to include in the scan. |
| # If empty, all files are scanned and available data format processors |
| # are applied. In addition, the binary content of the selected files |
| # is always scanned as well. |
| "A String", |
| ], |
| }, |
| }, |
| "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. |
| # When used with redactContent only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless of whether this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless of whether this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These types of transformations are |
| # those that perform pseudonymization, thereby producing a "surrogate" as |
| # output. This should be used in conjunction with a field on the |
| # transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in |
| # infoType, when the name matches one of the existing infoTypes and that infoType |
| # is specified in the `InspectContent.info_types` field. Specifying the latter |
| # adds findings to those detected by the system. If the built-in infoType is |
| # not specified in the `InspectContent.info_types` list, then the name is treated |
| # as a custom infoType. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # A list of phrases to detect as a CustomInfoType. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order in which they are specified. Not supported for the |
| # `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding |
| # to be returned. It can still be used for rule matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed last; other |
| # rules are executed in the order in which they are specified for each info type. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. |
| # `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # The infoType list in an ExclusionRule drops a finding when it overlaps with or |
| # is contained within a finding of an infoType from this list. For |
| # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and |
| # `exclusion_rule` containing `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. |
| # As a result, "[email protected]" generates only a single |
| # finding, namely the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the Unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied; see the MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but the set of detectors it triggers may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run, you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. |
| # `inspect_config` will be merged into the values persisted as part of the |
| # template. |
| "actions": [ # Actions to execute at the completion of the job. |
| { # A task to execute on the completion of a job. |
| # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. |
| "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. |
| # OutputStorageConfig. Only a single instance of this action can be |
| # specified. |
| # Compatible with: Inspect, Risk |
| "outputConfig": { # Cloud repository for storing output. |
| "table": { # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| # |
| # Store findings in an existing table or a new table in an existing |
| # dataset. If table_id is not set a new one will be generated |
| # for you with the following format: |
| # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for |
| # generating the date details. |
| # |
| # For Inspect, each column in an existing output table must have the same |
| # name, type, and mode as a field in the `Finding` object. |
| # |
| # For Risk, an existing output table should be the output of a previous |
| # Risk analysis job run on the same source table, with the same privacy |
| # metric and quasi-identifiers. Risk jobs that analyze the same table but |
| # compute a different privacy metric, or use different sets of |
| # quasi-identifiers, cannot store their results in the same table. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only |
| # used for Inspect and must be unspecified for Risk jobs. Columns are derived |
| # from the `Finding` object. If appending to an existing table, any columns |
| # from the predefined schema that are missing will be added. No columns in |
| # the existing table will be deleted. |
| # |
| # If unspecified, then all available columns will be used for a new table or |
| # an (existing) table with no schema, and no changes will be made to an |
| # existing table that has a schema. |
| }, |
| }, |
| "jobNotificationEmails": { # Enable email notification to project owners and editors on job's |
| # completion/failure. |
| }, |
| "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). |
| # Command Center (CSCC Alpha). |
| # This action is only available for projects that are part of |
| # an organization and whitelisted for the alpha Cloud Security Command |
| # Center. |
| # The action will publish the count of finding instances and their info types. |
| # The summary of findings will be persisted in CSCC and is governed by the CSCC |
| # service-specific policy; see https://cloud.google.com/terms/service-terms |
| # Only a single instance of this action can be specified. |
| # Compatible with: Inspect |
| }, |
| "pubSub": { # Publish a message into a given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a Pub/Sub topic. |
| # message contains a single field, `DlpJobName`, which is equal to the |
| # finished job's |
| # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). |
| # Compatible with: Inspect, Risk |
| "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given |
| # publishing access rights to the DLP API service account executing |
| # the long running DlpJob sending the notifications. |
| # Format is projects/{project}/topics/{topic}. |
| }, |
| }, |
| ], |
| }, |
| }, |
| "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job. |
| "infoTypeStats": [ # Statistics of how many instances of each info type were found during |
| # the inspect job. |
| { # Statistics regarding a specific InfoType. |
| "count": "A String", # Number of findings for this infoType. |
| "infoType": { # Type of information detected by the API. # The type of finding this stat is for. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| }, |
| ], |
| "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process. |
| "processedBytes": "A String", # Total size in bytes that were processed. |
| }, |
| }, |
| "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source. |
| "numericalStatsResult": { # Result of the numerical stats computation. |
| "quantileValues": [ # List of 99 values that partition the set of field values into 100 |
| # equal-sized buckets. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an |
| # estimation, not exact values. |
| "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value |
| # doesn't correspond to any such interval, the associated frequency is |
| # zero. For example, the following records: |
| # {min_anonymity: 1, max_anonymity: 1, frequency: 17} |
| # {min_anonymity: 2, max_anonymity: 3, frequency: 42} |
| # {min_anonymity: 5, max_anonymity: 10, frequency: 99} |
| # mean that there are no records with an estimated anonymity of 4 or |
| # larger than 10. |
| { # A KMapEstimationHistogramBucket message with the following values: |
| # min_anonymity: 3 |
| # max_anonymity: 5 |
| # frequency: 42 |
| # means that there are 42 records whose quasi-identifier values correspond |
| # to 3, 4 or 5 people in the overlying population. An important particular |
| # case is when min_anonymity = max_anonymity = 1: the frequency field then |
| # corresponds to the number of uniquely identifiable records. |
| "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total |
| # number of classes returned per bucket is capped at 20. |
| { # A tuple of values for the quasi-identifier columns. |
| "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values. |
| "quasiIdsValues": [ # The quasi-identifier values. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| }, |
| ], |
| "minAnonymity": "A String", # Always positive. |
| "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. |
| "maxAnonymity": "A String", # Always greater than or equal to min_anonymity. |
| "bucketSize": "A String", # Number of records within these anonymity bounds. |
| }, |
| ], |
| }, |
| "kAnonymityResult": { # Result of the k-anonymity computation. |
| "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes. |
| { |
| "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of |
| # classes returned per bucket is capped at 20. |
| { # The set of columns' values that share the same ldiversity value |
| "quasiIdsValues": [ # Set of values defining the equivalence class. One value per |
| # quasi-identifier column in the original KAnonymity metric message. |
| # The order is always the same as the original request. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the |
| # above set of values. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. |
| "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket. |
| "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket. |
| "bucketSize": "A String", # Total number of equivalence classes in this bucket. |
| }, |
| ], |
| }, |
| "lDiversityResult": { # Result of the l-diversity computation. |
| "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies. |
| { |
| "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of |
| # classes returned per bucket is capped at 20. |
| { # The set of columns' values that share the same l-diversity value. |
| "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class. |
| "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence |
| # class. The order is always the same as the original request. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "topSensitiveValues": [ # Estimated frequencies of top sensitive values. |
| { # A value of a field, including its frequency. |
| "count": "A String", # How many times the value is contained in the field. |
| "value": { # Set of primitive values supported by the system. # A value contained in the field in question. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| ], |
| "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. |
| "bucketSize": "A String", # Total number of equivalence classes in this bucket. |
| "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence |
| # classes in this bucket. |
| "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence |
| # classes in this bucket. |
| }, |
| ], |
| }, |
| "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. |
| "numericalStatsConfig": { # Compute numerical stats over an individual column, including |
| # min, max, and quantiles. |
| "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are |
| # integer, float, date, datetime, timestamp, time. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what |
| # is called "journalist risk" in the literature, except the attack dataset is |
| # statistically modeled instead of being perfectly known. This can be done |
| # using publicly available data (like the US Census), or using a custom |
| # statistical model (indicated as one or several BigQuery tables), or by |
| # extrapolating from the distribution of values in the input dataset. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers column must appear in exactly one column |
| # of one auxiliary table. |
| { # An auxiliary table contains statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Auxiliary table location. [required] |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. |
| "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are |
| # defined for the l-diversity computation. When multiple fields are |
| # specified, they are considered a single composite key. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to |
| # figure out that one given individual appears in a de-identified dataset. |
| # Similarly to the k-map metric, we cannot compute δ-presence exactly without |
| # knowing the attack dataset, so we use a statistical model instead. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers field must appear in exactly one |
| # field of one auxiliary table. |
| { # An auxiliary table containing statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Auxiliary table location. [required] |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "categoricalStatsConfig": { # Compute categorical stats over an individual column, including |
| # number of distinct values and value count distribution. |
| "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are |
| # supported except for arrays and structs. However, it may be more |
| # informative to use NumericalStats when the field type is supported, |
| # depending on the data. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. |
| "entityId": { # Optional message indicating that multiple rows might be associated with a |
| # single individual. If the same entity_id is associated with multiple |
| # quasi-identifier tuples over distinct rows, we consider the entire |
| # collection of tuples as the composite quasi-identifier. This collection |
| # is a multiset: the order in which the different tuples appear in the |
| # dataset is ignored, but their frequency is taken into account. |
| # |
| # Important note: a maximum of 1000 rows can be associated with a single |
| # entity ID. If more rows are associated with the same entity ID, some |
| # might be ignored. |
| # |
| # An entity in a dataset is a field or set of fields that correspond to a |
| # single person. For example, in medical records the `EntityId` might be a |
| # patient identifier, or for financial records it might be an account |
| # identifier. This message is used when generalizations or analysis must take |
| # into account that multiple rows correspond to the same entity. |
| "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are |
| # specified, they are considered a single composite key. Structs and |
| # repeated data types are not supported; however, nested fields are |
| # supported so long as they are not structs themselves or nested within |
| # a repeated field. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| }, |
| "categoricalStatsResult": { # Result of the categorical stats computation. |
| "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column. |
| { |
| "bucketValues": [ # Sample of value frequencies in this bucket. The total number of |
| # values returned per bucket is capped at 20. |
| { # A value of a field, including its frequency. |
| "count": "A String", # How many times the value is contained in the field. |
| "value": { # Set of primitive values supported by the system. # A value contained in the field in question. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct values in this bucket. |
| "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket. |
| "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket. |
| "bucketSize": "A String", # Total number of values in this bucket. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an |
| # estimation, not exact values. |
| "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a |
| # value doesn't correspond to any such interval, the associated frequency |
| # is zero. For example, the following records: |
| # {min_probability: 0, max_probability: 0.1, frequency: 17} |
| # {min_probability: 0.2, max_probability: 0.3, frequency: 42} |
| # {min_probability: 0.3, max_probability: 0.4, frequency: 99} |
| # mean that there are no records with an estimated probability in [0.1, 0.2) |
| # or greater than or equal to 0.4. |
| { # A DeltaPresenceEstimationHistogramBucket message with the following |
| # values: |
| # min_probability: 0.1 |
| # max_probability: 0.2 |
| # frequency: 42 |
| # means that there are 42 records for which δ is in [0.1, 0.2). An |
| # important particular case is when min_probability = max_probability = 1: |
| # then, every individual who shares this quasi-identifier combination is in |
| # the dataset. |
| "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total |
| # number of classes returned per bucket is capped at 20. |
| { # A tuple of values for the quasi-identifier columns. |
| "quasiIdsValues": [ # The quasi-identifier values. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these |
| # quasi-identifier values is in the dataset. This value, typically called |
| # δ, is the ratio between the number of records in the dataset with these |
| # quasi-identifier values, and the total number of individuals (inside |
| # *and* outside the dataset) with these quasi-identifier values. |
| # For example, if there are 15 individuals in the dataset who share the |
| # same quasi-identifier values, and an estimated 100 people in the entire |
| # population with these values, then δ is 0.15. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. |
| "bucketSize": "A String", # Number of records within these probability bounds. |
| "maxProbability": 3.14, # Always greater than or equal to min_probability. |
| "minProbability": 3.14, # Between 0 and 1. |
| }, |
| ], |
| }, |
| "requestedSourceTable": { # Input dataset to compute metrics over. |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "state": "A String", # State of a job. |
| "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that |
| # instantiated the job. |
| "startTime": "A String", # Time when the job started. |
| "endTime": "A String", # Time when the job finished. |
| "type": "A String", # The type of job. |
| "createTime": "A String", # Time when the job was created. |
| }</pre> |
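<p>The risk-analysis portion of the response above can be walked with ordinary dictionary access. The following is a minimal sketch, not part of the generated client: the helper names are hypothetical, the sample data is hand-written, and it assumes the results sit under the job's <code>riskDetails</code> key, which lies outside the excerpt shown here. Fields documented as "A String" that hold counts are converted with <code>int()</code> before use.</p>

```python
def summarize_k_anonymity(job):
    """Return (lower_bound, upper_bound, class_count) for each histogram bucket."""
    # Assumption: risk results live under the job's "riskDetails" key.
    result = job.get("riskDetails", {}).get("kAnonymityResult", {})
    summary = []
    for bucket in result.get("equivalenceClassHistogramBuckets", []):
        # Count-like fields are serialized as strings, so convert them first.
        lower = int(bucket.get("equivalenceClassSizeLowerBound", 0))
        upper = int(bucket.get("equivalenceClassSizeUpperBound", 0))
        count = int(bucket.get("bucketSize", 0))
        summary.append((lower, upper, count))
    return summary


def smallest_class_lower_bound(job):
    """Smallest lower bound across all buckets, or None when no buckets exist."""
    lows = [lower for lower, _, _ in summarize_k_anonymity(job)]
    return min(lows) if lows else None


# Hand-written sample shaped like the response above (not real API output).
sample_job = {
    "riskDetails": {
        "kAnonymityResult": {
            "equivalenceClassHistogramBuckets": [
                {
                    "equivalenceClassSizeLowerBound": "1",
                    "equivalenceClassSizeUpperBound": "1",
                    "bucketSize": "3",
                },
                {
                    "equivalenceClassSizeLowerBound": "2",
                    "equivalenceClassSizeUpperBound": "5",
                    "bucketSize": "10",
                },
            ]
        }
    }
}

print(summarize_k_anonymity(sample_job))
print(smallest_class_lower_bound(sample_job))
```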
| </div> |
| |
| <div class="method"> |
| <code class="details" id="list">list(parent, orderBy=None, type=None, pageSize=None, pageToken=None, x__xgafv=None, filter=None)</code> |
| <pre>Lists DlpJobs that match the specified filter in the request. |
| See https://cloud.google.com/dlp/docs/inspecting-storage and |
| https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. |
| |
| Args: |
| parent: string, The parent resource name, for example projects/my-project-id. (required) |
| orderBy: string, Optional comma-separated list of fields to order by, |
| each optionally followed by an `asc` or `desc` postfix. The list is |
| case-insensitive, the default sorting order is ascending, and redundant |
| space characters are insignificant. |
| |
| Example: `name asc, end_time asc, create_time desc` |
| |
| Supported fields are: |
| |
| - `create_time`: corresponds to the time the job was created. |
| - `end_time`: corresponds to the time the job ended. |
| - `name`: corresponds to the job's name. |
| - `state`: corresponds to the job's `state`. |
| type: string, The type of job. Defaults to `DlpJobType.INSPECT`. |
| pageSize: integer, The standard list page size. |
| pageToken: string, The standard list page token. |
| x__xgafv: string, V1 error format. |
| Allowed values |
| 1 - v1 error format |
| 2 - v2 error format |
| filter: string, Optional. Allows filtering. |
| |
| Supported syntax: |
| |
| * Filter expressions are made up of one or more restrictions. |
| * Restrictions can be combined by `AND` or `OR` logical operators. A |
| sequence of restrictions implicitly uses `AND`. |
| * A restriction has the form of `<field> <operator> <value>`. |
| * Supported fields/values for inspect jobs: |
| - `state` - PENDING|RUNNING|CANCELED|FINISHED|FAILED |
| - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY |
| - `trigger_name` - The resource name of the trigger that created the job. |
| - `end_time` - Corresponds to the time the job finished. |
| - `start_time` - Corresponds to the time the job started. |
| * Supported fields for risk analysis jobs: |
| - `state` - RUNNING|CANCELED|FINISHED|FAILED |
| - `end_time` - Corresponds to the time the job finished. |
| - `start_time` - Corresponds to the time the job started. |
| * The operator must be `=` or `!=`. |
| |
| Examples: |
| |
| * inspected_storage = cloud_storage AND state = done |
| * inspected_storage = cloud_storage OR inspected_storage = bigquery |
| * inspected_storage = cloud_storage AND (state = done OR state = canceled) |
| * end_time > "2017-12-12T00:00:00+00:00" |
| |
| The length of this field should be no more than 500 characters. |
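
The `orderBy`, `filter`, and `pageToken` arguments described above can be
combined as in the following minimal sketch. The helper names
(`build_order_by`, `build_filter`, `list_all_jobs`) are hypothetical and not
part of the client library; only `list(...)`, `.execute()`, and the `jobs` and
`nextPageToken` response fields come from this page. The allowed sort fields
and the 500-character limit are taken from the argument descriptions above.

```python
MAX_FILTER_LENGTH = 500  # limit stated in the `filter` description
ORDER_BY_FIELDS = {"create_time", "end_time", "name", "state"}


def build_order_by(*fields):
    """Join (field, direction) pairs into an `orderBy` string."""
    parts = []
    for field, direction in fields:
        if field not in ORDER_BY_FIELDS:
            raise ValueError(f"unsupported orderBy field: {field}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc': {direction}")
        parts.append(f"{field} {direction}")
    return ", ".join(parts)


def build_filter(*restrictions, operator="AND"):
    """Combine restrictions with AND/OR and enforce the length limit."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    expression = f" {operator} ".join(restrictions)
    if len(expression) > MAX_FILTER_LENGTH:
        raise ValueError("filter exceeds 500 characters")
    return expression


def list_all_jobs(dlp_jobs, parent, **params):
    """Collect jobs from every page by following `nextPageToken`.

    `dlp_jobs` is assumed to be the resource object that exposes `list`,
    for example `service.projects().dlpJobs()`.
    """
    jobs = []
    page_token = None
    while True:
        request = dlp_jobs.list(parent=parent, pageToken=page_token, **params)
        response = request.execute()
        jobs.extend(response.get("jobs", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return jobs


# Usage sketch (assumes google-api-python-client and valid credentials):
#   from googleapiclient.discovery import build
#   dlp_jobs = build("dlp", "v2").projects().dlpJobs()
#   jobs = list_all_jobs(
#       dlp_jobs,
#       "projects/my-project-id",
#       orderBy=build_order_by(("create_time", "desc")),
#       filter=build_filter("inspected_storage = cloud_storage"),
#   )
```

Validating the strings on the client side only catches obvious mistakes early;
the server remains the authority on which filters and sort fields it accepts.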
| |
| Returns: |
| An object of the form: |
| |
| { # The response message for listing DLP jobs. |
| "nextPageToken": "A String", # The standard List next-page token. |
| "jobs": [ # A list of DlpJobs that matches the specified filter in the request. |
| { # Combines all of the information about a DLP job. |
| "errors": [ # A stream of errors encountered running the job. |
| { # Detailed information about an error encountered during job execution or |
| # the results of an unsuccessful activation of the JobTrigger. |
| # Output only field. |
| "timestamps": [ # The times the error occurred. |
| "A String", |
| ], |
| "details": { # The `Status` type defines a logical error model that is suitable for |
| # different programming environments, including REST APIs and RPC APIs. It is |
| # used by [gRPC](https://github.com/grpc). Each `Status` message contains |
| # three pieces of data: error code, error message, and error details. |
| # |
| # You can find out more about this error model and how to work with it in the |
| # [API Design Guide](https://cloud.google.com/apis/design/errors). |
| "message": "A String", # A developer-facing error message, which should be in English. Any |
| # user-facing error message should be localized and sent in the |
| # google.rpc.Status.details field, or localized by the client. |
| "code": 42, # The status code, which should be an enum value of google.rpc.Code. |
| "details": [ # A list of messages that carry the error details. There is a common set of |
| # message types for APIs to use. |
| { |
| "a_key": "", # Properties of the object. Contains field @type with type URL. |
| }, |
| ], |
| }, |
| }, |
| ], |
| "name": "A String", # The server-assigned name. |
| "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source. |
| "requestedOptions": { # The configuration used for this job. |
| "snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of |
| # this run. |
| # The inspectTemplate contains a configuration (set of types of sensitive data |
| # to be detected) to be used anywhere you otherwise would normally specify |
| # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates |
| # to learn more. |
| "updateTime": "A String", # The last update timestamp of an inspectTemplate, output only field. |
| "displayName": "A String", # Display name (max 256 chars). |
| "description": "A String", # Short description (max 256 chars). |
| "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process. |
| # When used with redactContent only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless if this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless if this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These types of transformations are |
| # those that perform pseudonymization, thereby producing a "surrogate" as |
| # output. This should be used in conjunction with a field on the |
| # transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in |
| # infoType, when the name matches one of existing infoTypes and that infoType |
| # is specified in `InspectContent.info_types` field. Specifying the latter |
| # adds findings to those detected by the system. If the built-in info type is |
| # not specified in `InspectContent.info_types` list then the name is treated |
| # as a custom info type. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # A list of phrases to detect as a CustomInfoType. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order that they are specified. Not supported for the |
| # `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # Hotword-based detection rule. |
| # The rule that adjusts the likelihood of findings within a certain |
| # proximity of hotwords. |
| "proximity": { # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| # Message for specifying a window around a finding to apply a detection |
| # rule. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. |
| # Message for specifying an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding |
| # to be returned. It still can be used for rules matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed at the end; other |
| # rules are executed in the order they are specified for each info type. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # Hotword-based detection rule. |
| # The rule that adjusts the likelihood of findings within a certain |
| # proximity of hotwords. |
| "proximity": { # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| # Message for specifying a window around a finding to apply a detection |
| # rule. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. |
| # Message for specifying an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # Exclusion rule. |
| # The rule that specifies conditions when findings of infoTypes specified in |
| # `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # The InfoType list in an ExclusionRule drops a finding when it overlaps |
| # with or is contained within a finding of an infoType from this list. For |
| # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and |
| # `exclusion_rule` containing `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. |
| # As a result, text in which a phone number is embedded in an email address |
| # generates only a single finding, namely the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Dictionary which defines the rule. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "createTime": "A String", # The creation timestamp of an inspectTemplate, output only field. |
| "name": "A String", # The template name. Output only. |
| # |
| # The template will have one of the following formats: |
| # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR |
| # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID` |
| }, |
| "jobConfig": { |
| "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. |
| "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. |
| "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always |
| # by project and namespace, however the namespace ID may be empty. |
| # |
| # A partition ID contains several dimensions: |
| # project ID and namespace ID. |
| "projectId": "A String", # The ID of the project to which the entities belong. |
| "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. |
| }, |
| "kind": { # A representation of a Datastore kind. # The kind to process. |
| "name": "A String", # The name of the kind. |
| }, |
| }, |
| "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. |
| "excludedFields": [ # References to fields excluded from scanning. This allows you to skip |
| # inspection of entire columns which you know have no findings. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the |
| # rest of the rows are omitted. If not set, or if set to 0, all rows will be |
| # scanned. Only one of rows_limit and rows_limit_percent can be specified. |
| # Cannot be used in conjunction with TimespanConfig. |
| "sampleMethod": "A String", |
| "identifyingFields": [ # References to fields uniquely identifying rows within the table. |
| # Nested fields, in a format like `person.birthdate.year`, are allowed. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows |
| # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and |
| # 100 mean no limit. Defaults to 0. Only one of rows_limit and |
| # rows_limit_percent can be specified. Cannot be used in conjunction with |
| # TimespanConfig. |
| "tableReference": { # Complete BigQuery table reference. |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "timespanConfig": { # Configuration of the timespan of the items to include in scanning. |
| # Currently only supported when inspecting Google Cloud Storage and BigQuery. |
| "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. |
| # Used for data sources like Datastore or BigQuery. |
| # If not specified for BigQuery, table last modification timestamp |
| # is checked against given time span. |
| # The valid data types of the timestamp field are: |
| # for BigQuery - timestamp, date, datetime; |
| # for Datastore - timestamp. |
| # Datastore entity will be scanned if the timestamp property does not exist |
| # or its value is empty or invalid. |
| "name": "A String", # Name describing the field. |
| }, |
| "endTime": "A String", # Exclude files or rows newer than this value. |
| # If set to zero, no upper time limit is applied. |
| "startTime": "A String", # Exclude files or rows older than this value. |
| "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out |
| # a valid start_time to avoid scanning files that have not been modified |
| # since the last time the JobTrigger executed. This will be based on the |
| # time of the execution of the last run of the JobTrigger. |
| }, |
| "cloudStorageOptions": { # Google Cloud Storage options specification. |
| # Options defining a file or a set of files within a Google Cloud Storage |
| # bucket. |
| "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger |
| # than this value then the rest of the bytes are omitted. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "sampleMethod": "A String", |
| "fileSet": { # Set of files to scan. # The set of one or more files to scan. |
| "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format |
| # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. |
| # |
| # If the url ends in a trailing slash, the bucket or directory represented |
| # by the url will be scanned non-recursively (content in sub-directories |
| # will not be scanned). This means that `gs://mybucket/` is equivalent to |
| # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to |
| # `gs://mybucket/directory/*`. |
| # |
| # Exactly one of `url` or `regex_file_set` must be set. |
| "regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or |
| # `regex_file_set` must be set. |
| # Message representing a set of files in a Cloud Storage bucket. Regular |
| # expressions are used to allow fine-grained control over which files in the |
| # bucket to include. |
| # |
| # Included files are those that match at least one item in `include_regex` and |
| # do not match any items in `exclude_regex`. Note that a file that matches |
| # items from both lists will _not_ be included. For a match to occur, the |
| # entire file path (i.e., everything in the url after the bucket name) must |
| # match the regular expression. |
| # |
| # For example, given the input `{bucket_name: "mybucket", include_regex: |
| # ["directory1/.*"], exclude_regex: |
| # ["directory1/excluded.*"]}`: |
| # |
| # * `gs://mybucket/directory1/myfile` will be included |
| # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches |
| # across `/`) |
| # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the |
| # full path doesn't match any items in `include_regex`) |
| # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path |
| # matches an item in `exclude_regex`) |
| # |
| # If `include_regex` is left empty, it will match all files by default |
| # (this is equivalent to setting `include_regex: [".*"]`). |
| # |
| # Some other common use cases: |
| # |
| # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all |
| # files in `mybucket` except for .pdf files |
| # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will |
| # include all files directly under `gs://mybucket/directory/`, without matching |
| # across `/` |
| "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # excluded from the scan. |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| "bucketName": "A String", # The name of a Cloud Storage bucket. Required. |
| "includeRegex": [ # A list of regular expressions matching file paths to include. All files in |
| # the bucket that match at least one of these regular expressions will be |
| # included in the set of files, except for those that also match an item in |
| # `exclude_regex`. Leaving this field empty will match all files by default |
| # (this is equivalent to including `.*` in the list). |
| # |
| # Regular expressions use RE2 |
| # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found |
| # under the google/re2 repository on GitHub. |
| "A String", |
| ], |
| }, |
| }, |
| "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The |
| # number of bytes scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one |
| # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. |
| "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. |
| # Number of files scanned is rounded down. Must be between 0 and 100, |
| # inclusive. Both 0 and 100 mean no limit. Defaults to 0. |
| "fileTypes": [ # List of file type groups to include in the scan. |
| # If empty, all files are scanned and available data format processors |
| # are applied. In addition, the binary content of the selected files |
| # is always scanned as well. |
| "A String", |
| ], |
| }, |
| }, |
| "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. |
| # When used with redactContent, only info_types and min_likelihood are currently |
| # used. |
| "excludeInfoTypes": True or False, # When true, excludes type information of the findings. |
| "limits": { |
| "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. |
| # When set within `InspectContentRequest`, the maximum returned is 2000 |
| # regardless of whether this is set higher. |
| "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. |
| { # Max findings configuration per infoType, per content item or long |
| # running DlpJob. |
| "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per |
| # info_type should be provided. If InfoTypeLimit does not have an |
| # info_type, the DLP API applies the limit against all info_types that |
| # are found but not specified in another InfoTypeLimit. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "maxFindings": 42, # Max findings limit for the given infoType. |
| }, |
| ], |
| "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. |
| # When set within `InspectDataSourceRequest`, |
| # the maximum returned is 2000 regardless of whether this is set higher. |
| # When set within `InspectContentRequest`, this field is ignored. |
| }, |
| "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is |
| # POSSIBLE. |
| # See https://cloud.google.com/dlp/docs/likelihood to learn more. |
| "customInfoTypes": [ # CustomInfoTypes provided by the user. See |
| # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. |
| { # Custom information type provided by the user. Used to find domain-specific |
| # sensitive information configurable to the data in question. |
| "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "surrogateType": { # Message for detecting output from deidentification transformations that |
| # support reversing, such as |
| # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). |
| # These types of transformations are |
| # those that perform pseudonymization, thereby producing a "surrogate" as |
| # output. This should be used in conjunction with a field on the |
| # transformation such as `surrogate_info_type`. This CustomInfoType does |
| # not support the use of `detection_rules`. |
| }, |
| "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in |
| # infoType, when the name matches one of the existing infoTypes and that infoType |
| # is specified in the `InspectContent.info_types` field. Specifying the latter |
| # adds findings to the ones detected by the system. If the built-in info type is |
| # not specified in the `InspectContent.info_types` list, then the name is treated |
| # as a custom info type. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "dictionary": { # A list of phrases to detect as a CustomInfoType. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the Unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in |
| # `InspectDataSource`. Not currently supported in `InspectContent`. |
| "name": "A String", # Resource name of the requested `StoredInfoType`, for example |
| # `organizations/433245324/storedInfoTypes/432452342` or |
| # `projects/project-id/storedInfoTypes/432452342`. |
| "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for |
| # inspection was created. Output-only field, populated by the system. |
| }, |
| "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. |
| # Rules are applied in the order that they are specified. Not supported for the |
| # `surrogate_type` CustomInfoType. |
| { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a |
| # `CustomInfoType` to alter behavior under certain circumstances, depending |
| # on the specific details of the rule. Not supported for the `surrogate_type` |
| # custom infoType. |
| "hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of findings within a certain |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. |
| # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. |
| # Message for specifying an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| }, |
| ], |
| "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding |
| # to be returned. It still can be used for rules matching. |
| "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be |
| # altered by a detection rule if the finding meets the criteria specified by |
| # the rule. Defaults to `VERY_LIKELY` if not specified. |
| }, |
| ], |
| "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is |
| # included in the response; see Finding.quote. |
| "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. |
| # Exclusion rules contained in the set are executed last; other |
| # rules are executed in the order they are specified for each info type. |
| { # Rule set for modifying a set of infoTypes to alter behavior under certain |
| # circumstances, depending on the specific details of the rules within the set. |
| "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. |
| { # A single inspection rule to be applied to infoTypes, specified in |
| # `InspectionRuleSet`. |
| "hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of findings within a certain |
| # proximity of hotwords. |
| "proximity": { # Message for specifying a window around a finding to apply a detection rule. |
| # Proximity of the finding within which the entire hotword must reside. |
| # The total length of the window cannot exceed 1000 characters. Note that |
| # the finding itself will be included in the window, so that hotwords may |
| # be used to match substrings of the finding itself. For example, the |
| # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be |
| # adjusted upwards if the area code is known to be the local area code of |
| # a company office using the hotword regex "\(xxx\)", where "xxx" |
| # is the area code in question. |
| "windowAfter": 42, # Number of characters after the finding to consider. |
| "windowBefore": 42, # Number of characters before the finding to consider. |
| }, |
| "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. |
| # Message for specifying an adjustment to the likelihood of a finding as |
| # part of a detection rule. |
| "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of |
| # levels. For example, if a finding would be `POSSIBLE` without the |
| # detection rule and `relative_likelihood` is 1, then it is upgraded to |
| # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. |
| # Likelihood may never drop below `VERY_UNLIKELY` or exceed |
| # `VERY_LIKELY`, so applying an adjustment of 1 followed by an |
| # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in |
| # a final likelihood of `LIKELY`. |
| "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. |
| }, |
| }, |
| "exclusionRule": { # Exclusion rule. The rule that specifies conditions when findings of infoTypes specified in |
| # `InspectionRuleSet` are removed from results. |
| "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. |
| "pattern": "A String", # Pattern defining the regular expression. Its syntax |
| # (https://github.com/google/re2/wiki/Syntax) can be found under the |
| # google/re2 repository on GitHub. |
| "groupIndexes": [ # The index of the submatch to extract as findings. When not |
| # specified, the entire match is returned. No more than 3 may be included. |
| 42, |
| ], |
| }, |
| "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. |
| "infoTypes": [ # The InfoType list in an ExclusionRule drops a finding when it overlaps or |
| # is contained within a finding of an infoType from this list. For |
| # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and |
| # `exclusion_rule` containing `exclude_info_types.info_types` with |
| # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap |
| # with an EMAIL_ADDRESS finding. |
| # As a result, an email address that contains a phone number generates only |
| # a single finding, namely the email address. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "dictionary": { # Dictionary which defines the rule. |
| # Custom information type based on a dictionary of words or phrases. This can |
| # be used to match sensitive information specific to the data, such as a list |
| # of employee IDs or job titles. |
| # |
| # Dictionary words are case-insensitive and all characters other than letters |
| # and digits in the Unicode [Basic Multilingual |
| # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) |
| # will be replaced with whitespace when scanning for matches, so the |
| # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", |
| # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters |
| # surrounding any match must be of a different type than the adjacent |
| # characters within the word, so letters must be next to non-letters and |
| # digits next to non-digits. For example, the dictionary word "jen" will |
| # match the first three letters of the text "jen123" but will return no |
| # matches for "jennifer". |
| # |
| # Dictionary words containing a large number of characters that are not |
| # letters or digits may result in unexpected findings because such characters |
| # are treated as whitespace. The |
| # [limits](https://cloud.google.com/dlp/limits) page contains details about |
| # the size limits of dictionaries. For dictionaries that do not fit within |
| # these constraints, consider using `LargeCustomDictionaryConfig` in the |
| # `StoredInfoType` API. |
| "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. |
| "words": [ # Words or phrases defining the dictionary. The dictionary must contain |
| # at least one phrase and every phrase must contain at least 2 characters |
| # that are letters or digits. [required] |
| "A String", |
| ], |
| }, |
| "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file |
| # is accepted. |
| "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. |
| # Example: gs://[BUCKET_NAME]/dictionary.txt |
| }, |
| }, |
| "matchingType": "A String", # How the rule is applied; see MatchingType documentation for details. |
| }, |
| }, |
| ], |
| "infoTypes": [ # List of infoTypes this rule set is applied to. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| ], |
| "contentOptions": [ # List of options defining data content to scan. |
| # If empty, text, images, and other content will be included. |
| "A String", |
| ], |
| "infoTypes": [ # Restricts what info_types to look for. The values must correspond to |
| # InfoType values returned by ListInfoTypes or listed at |
| # https://cloud.google.com/dlp/docs/infotypes-reference. |
| # |
| # When no InfoTypes or CustomInfoTypes are specified in a request, the |
| # system may automatically choose what detectors to run. By default this may |
| # be all types, but may change over time as detectors are updated. |
| # |
| # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, |
| # but may change over time as new InfoTypes are added. If you need precise |
| # control and predictability as to what detectors are run you should specify |
| # specific InfoTypes listed in the reference. |
| { # Type of information detected by the API. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| ], |
| }, |
| "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. |
| # `inspect_config` will be merged into the values persisted as part of the |
| # template. |
| "actions": [ # Actions to execute at the completion of the job. |
| { # A task to execute on the completion of a job. |
| # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. |
| "saveFindings": { # Save resulting findings in a provided location. |
| # If set, the detailed findings will be persisted to the specified |
| # OutputStorageConfig. Only a single instance of this action can be |
| # specified. |
| # Compatible with: Inspect, Risk |
| "outputConfig": { # Cloud repository for storing output. |
| "table": { # Store findings in an existing table or a new table in an existing |
| # dataset. If table_id is not set a new one will be generated |
| # for you with the following format: |
| # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for |
| # generating the date details. |
| # |
| # For Inspect, each column in an existing output table must have the same |
| # name, type, and mode of a field in the `Finding` object. |
| # |
| # For Risk, an existing output table should be the output of a previous |
| # Risk analysis job run on the same source table, with the same privacy |
| # metric and quasi-identifiers. Risk jobs that analyze the same table but |
| # compute a different privacy metric, or use different sets of |
| # quasi-identifiers, cannot store their results in the same table. |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only |
| # used for Inspect and must be unspecified for Risk jobs. Columns are derived |
| # from the `Finding` object. If appending to an existing table, any columns |
| # from the predefined schema that are missing will be added. No columns in |
| # the existing table will be deleted. |
| # |
| # If unspecified, then all available columns will be used for a new table or |
| # an (existing) table with no schema, and no changes will be made to an |
| # existing table that has a schema. |
| }, |
| }, |
| "jobNotificationEmails": { # Enable email notification to project owners and editors on job's |
| # completion/failure. |
| }, |
| "publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha). |
| # Publish the result summary of a DlpJob to the Cloud Security |
| # Command Center (CSCC Alpha). |
| # This action is only available for projects that are part of |
| # an organization and whitelisted for the alpha Cloud Security Command |
| # Center. |
| # The action will publish the count of finding instances and their info types. |
| # The summary of findings will be persisted in CSCC and is governed by CSCC |
| # service-specific policy; see https://cloud.google.com/terms/service-terms |
| # Only a single instance of this action can be specified. |
| # Compatible with: Inspect |
| }, |
| "pubSub": { # Publish a notification to a pubsub topic. |
| # Publish a message into given Pub/Sub topic when DlpJob has completed. The |
| # message contains a single field, `DlpJobName`, which is equal to the |
| # finished job's |
| # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). |
| # Compatible with: Inspect, Risk |
| "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given |
| # publishing access rights to the DLP API service account executing |
| # the long running DlpJob sending the notifications. |
| # Format is projects/{project}/topics/{topic}. |
| }, |
| }, |
| ], |
| }, |
| }, |
| "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job. |
| "infoTypeStats": [ # Statistics of how many instances of each info type were found during |
| # the inspect job. |
| { # Statistics regarding a specific InfoType. |
| "count": "A String", # Number of findings for this infoType. |
| "infoType": { # Type of information detected by the API. # The type of finding this stat is for. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| }, |
| ], |
| "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process. |
| "processedBytes": "A String", # Total size in bytes that were processed. |
| }, |
| }, |
| "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source. |
| "numericalStatsResult": { # Result of the numerical stats computation. |
| "quantileValues": [ # List of 99 values that partition the set of field values into 100 equal |
| # sized buckets. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an |
| # estimation, not exact values. |
| "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value |
| # doesn't correspond to any such interval, the associated frequency is |
| # zero. For example, the following records: |
| # {min_anonymity: 1, max_anonymity: 1, frequency: 17} |
| # {min_anonymity: 2, max_anonymity: 3, frequency: 42} |
| # {min_anonymity: 5, max_anonymity: 10, frequency: 99} |
| # mean that there are no records with an estimated anonymity of 4 or |
| # larger than 10. |
| { # A KMapEstimationHistogramBucket message with the following values: |
| # min_anonymity: 3 |
| # max_anonymity: 5 |
| # frequency: 42 |
| # means that there are 42 records whose quasi-identifier values correspond |
| # to 3, 4 or 5 people in the overlying population. An important particular |
| # case is when min_anonymity = max_anonymity = 1: the frequency field then |
| # corresponds to the number of uniquely identifiable records. |
| "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total |
| # number of classes returned per bucket is capped at 20. |
| { # A tuple of values for the quasi-identifier columns. |
| "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values. |
| "quasiIdsValues": [ # The quasi-identifier values. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| }, |
| ], |
| "minAnonymity": "A String", # Always positive. |
| "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. |
| "maxAnonymity": "A String", # Always greater than or equal to min_anonymity. |
| "bucketSize": "A String", # Number of records within these anonymity bounds. |
| }, |
| ], |
| }, |
| "kAnonymityResult": { # Result of the k-anonymity computation. |
| "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes. |
| { |
| "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of |
| # classes returned per bucket is capped at 20. |
| { # The set of columns' values that define a k-anonymity equivalence class. |
| "quasiIdsValues": [ # Set of values defining the equivalence class. One value per |
| # quasi-identifier column in the original KAnonymity metric message. |
| # The order is always the same as the original request. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the |
| # above set of values. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. |
| "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket. |
| "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket. |
| "bucketSize": "A String", # Total number of equivalence classes in this bucket. |
| }, |
| ], |
| }, |
| "lDiversityResult": { # Result of the l-diversity computation. |
| "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies. |
| { |
| "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of |
| # classes returned per bucket is capped at 20. |
| { # The set of columns' values that share the same l-diversity value. |
| "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class. |
| "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence |
| # class. The order is always the same as the original request. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "topSensitiveValues": [ # Estimated frequencies of top sensitive values. |
| { # A value of a field, including its frequency. |
| "count": "A String", # How many times the value is contained in the field. |
| "value": { # Set of primitive values supported by the system. # A value contained in the field in question. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| ], |
| "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. |
| "bucketSize": "A String", # Total number of equivalence classes in this bucket. |
| "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence |
| # classes in this bucket. |
| "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence |
| # classes in this bucket. |
| }, |
| ], |
| }, |
| "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. |
| "numericalStatsConfig": { # Compute numerical stats over an individual column, including |
| # min, max, and quantiles. |
| "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are |
| # integer, float, date, datetime, timestamp, time. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what |
| # is called "journalist risk" in the literature, except the attack dataset is |
| # statistically modeled instead of being perfectly known. This can be done |
| # using publicly available data (like the US Census), or using a custom |
| # statistical model (indicated as one or several BigQuery tables), or by |
| # extrapolating from the distribution of values in the input dataset. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers column must appear in exactly one column |
| # of one auxiliary table. |
| { # An auxiliary table contains statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Auxiliary table location. [required] |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. |
| "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are |
| # defined for the l-diversity computation. When multiple fields are |
| # specified, they are considered a single composite key. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to |
| # figure out that one given individual appears in a de-identified dataset. |
| # Similarly to the k-map metric, we cannot compute δ-presence exactly without |
| # knowing the attack dataset, so we use a statistical model instead. |
| "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. |
| # Required if no column is tagged with a region-specific InfoType (like |
| # US_ZIP_5) or a region code. |
| "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the |
| # same tag. [required] |
| { # A column with a semantic tag attached. |
| "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must |
| # indicate an auxiliary table that contains statistical information on |
| # the possible values of this column (below). |
| "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public |
| # dataset as a statistical model of population, if available. We |
| # currently support US ZIP codes, region codes, ages and genders. |
| # To programmatically obtain the list of supported InfoTypes, use |
| # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. |
| "name": "A String", # Name of the information type. Either a name of your choosing when |
| # creating a CustomInfoType, or one of the names listed |
| # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying |
| # a built-in type. InfoType names should conform to the pattern |
| # [a-zA-Z0-9_]{1,64}. |
| }, |
| "inferred": { # If no semantic tag is indicated, we infer the statistical model from |
| # the distribution of values in the input data. |
| # |
| # A generic empty message that you can re-use to avoid defining duplicated |
| # empty messages in your APIs. A typical example is to use it as the request |
| # or the response type of an API method. For instance: |
| # |
| # service Foo { |
| # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); |
| # } |
| # |
| # The JSON representation for `Empty` is empty JSON object `{}`. |
| }, |
| }, |
| ], |
| "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag |
| # used to tag a quasi-identifiers field must appear in exactly one |
| # field of one auxiliary table. |
| { # An auxiliary table containing statistical information on the relative |
| # frequency of different quasi-identifiers values. It has one or several |
| # quasi-identifiers columns, and one column that indicates the relative |
| # frequency of each quasi-identifier tuple. |
| # If a tuple is present in the data but not in the auxiliary table, the |
| # corresponding relative frequency is assumed to be zero (and thus, the |
| # tuple is highly reidentifiable). |
| "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number |
| # between 0 and 1 (inclusive). Null values are assumed to be zero. |
| # [required] |
| "name": "A String", # Name describing the field. |
| }, |
| "quasiIds": [ # Quasi-identifier columns. [required] |
| { # A quasi-identifier column has a custom_tag, used to know which column |
| # in the data corresponds to which column in the statistical model. |
| "field": { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| "customTag": "A String", |
| }, |
| ], |
| "table": { # Auxiliary table location. [required] |
| # |
| # Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| ], |
| }, |
| "categoricalStatsConfig": { # Compute categorical stats over an individual column, including |
| # number of distinct values and value count distribution. |
| "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are |
| # supported except for arrays and structs. However, it may be more |
| # informative to use NumericalStats when the field type is supported, |
| # depending on the data. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. |
| "entityId": { # Optional message indicating that multiple rows might be associated to a |
| # single individual. If the same entity_id is associated to multiple |
| # quasi-identifier tuples over distinct rows, we consider the entire |
| # collection of tuples as the composite quasi-identifier. This collection |
| # is a multiset: the order in which the different tuples appear in the |
| # dataset is ignored, but their frequency is taken into account. |
| # |
| # Important note: a maximum of 1000 rows can be associated to a single |
| # entity ID. If more rows are associated with the same entity ID, some |
| # might be ignored. |
| # |
| # An entity in a dataset is a field or set of fields that correspond to a |
| # single person. For example, in medical records the `EntityId` might be a |
| # patient identifier, or for financial records it might be an account |
| # identifier. This message is used when generalizations or analysis must take |
| # into account that multiple rows correspond to the same entity. |
| "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. |
| "name": "A String", # Name describing the field. |
| }, |
| }, |
| "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are |
| # specified, they are considered a single composite key. Structs and |
| # repeated data types are not supported; however, nested fields are |
| # supported so long as they are not structs themselves or nested within |
| # a repeated field. |
| { # General identifier of a data field in a storage service. |
| "name": "A String", # Name describing the field. |
| }, |
| ], |
| }, |
| }, |
| "categoricalStatsResult": { # Result of the categorical stats computation. |
| "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column. |
| { |
| "bucketValues": [ # Sample of value frequencies in this bucket. The total number of |
| # values returned per bucket is capped at 20. |
| { # A value of a field, including its frequency. |
| "count": "A String", # How many times the value is contained in the field. |
| "value": { # Set of primitive values supported by the system. # A value contained in the field in question. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct values in this bucket. |
| "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket. |
| "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket. |
| "bucketSize": "A String", # Total number of values in this bucket. |
| }, |
| ], |
| }, |
| "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an |
| # estimation, not exact values. |
| "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a |
| # value doesn't correspond to any such interval, the associated frequency |
| # is zero. For example, the following records: |
| # {min_probability: 0, max_probability: 0.1, frequency: 17} |
| # {min_probability: 0.2, max_probability: 0.3, frequency: 42} |
| # {min_probability: 0.3, max_probability: 0.4, frequency: 99} |
| # mean that there are no records with an estimated probability in [0.1, 0.2) |
| # or greater than or equal to 0.4. |
| { # A DeltaPresenceEstimationHistogramBucket message with the following |
| # values: |
| # min_probability: 0.1 |
| # max_probability: 0.2 |
| # frequency: 42 |
| # means that there are 42 records for which δ is in [0.1, 0.2). An |
| # important particular case is when min_probability = max_probability = 1: |
| # then, every individual who shares this quasi-identifier combination is in |
| # the dataset. |
| "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total |
| # number of classes returned per bucket is capped at 20. |
| { # A tuple of values for the quasi-identifier columns. |
| "quasiIdsValues": [ # The quasi-identifier values. |
| { # Set of primitive values supported by the system. |
| # Note that for the purposes of inspection or transformation, the number |
| # of bytes considered to comprise a 'Value' is based on its representation |
| # as a UTF-8 encoded string. For example, if 'integer_value' is set to |
| # 123456789, the number of bytes would be counted as 9, even though an |
| # int64 only holds up to 8 bytes of data. |
| "floatValue": 3.14, |
| "timestampValue": "A String", |
| "dayOfWeekValue": "A String", |
| "timeValue": { # Represents a time of day. The date and time zone are either not significant |
| # or are specified elsewhere. An API may choose to allow leap seconds. Related |
| # types are google.type.Date and `google.protobuf.Timestamp`. |
| "hours": 42, # Hours of day in 24-hour format. Should be from 0 to 23. An API may choose |
| # to allow the value "24:00:00" for scenarios like business closing time. |
| "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. |
| "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may |
| # allow the value 60 if it allows leap-seconds. |
| "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. |
| }, |
| "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day |
| # and time zone are either specified elsewhere or are not significant. The date |
| # is relative to the Proleptic Gregorian Calendar. This can represent: |
| # |
| # * A full date, with non-zero year, month and day values |
| # * A month and day value, with a zero year, e.g. an anniversary |
| # * A year on its own, with zero month and day values |
| # * A year and month value, with a zero day, e.g. a credit card expiration date |
| # |
| # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. |
| "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without |
| # a year. |
| "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 |
| # if specifying a year by itself or a year and month where the day is not |
| # significant. |
| "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a |
| # month and day. |
| }, |
| "stringValue": "A String", |
| "booleanValue": True or False, |
| "integerValue": "A String", |
| }, |
| ], |
| "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these |
| # quasi-identifier values is in the dataset. This value, typically called |
| # δ, is the ratio between the number of records in the dataset with these |
| # quasi-identifier values, and the total number of individuals (inside |
| # *and* outside the dataset) with these quasi-identifier values. |
| # For example, if there are 15 individuals in the dataset who share the |
| # same quasi-identifier values, and an estimated 100 people in the entire |
| # population with these values, then δ is 0.15. |
| }, |
| ], |
| "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. |
| "bucketSize": "A String", # Number of records within these probability bounds. |
| "maxProbability": 3.14, # Always greater than or equal to min_probability. |
| "minProbability": 3.14, # Between 0 and 1. |
| }, |
| ], |
| }, |
| "requestedSourceTable": { # Input dataset to compute metrics over. Message defining the location of a BigQuery table. A table is uniquely |
| # identified by its project_id, dataset_id, and table_name. Within a query |
| # a table is often referenced with a string in the format of: |
| # `<project_id>:<dataset_id>.<table_id>` or |
| # `<project_id>.<dataset_id>.<table_id>`. |
| "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. |
| # If omitted, project ID is inferred from the API call. |
| "tableId": "A String", # Name of the table. |
| "datasetId": "A String", # Dataset ID of the table. |
| }, |
| }, |
| "state": "A String", # State of a job. |
| "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that |
| # instantiated the job. |
| "startTime": "A String", # Time when the job started. |
| "endTime": "A String", # Time when the job finished. |
| "type": "A String", # The type of job. |
| "createTime": "A String", # Time when the job was created. |
| }, |
| ], |
| }</pre> |
| </div> |
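<p>The <code>deltaPresenceEstimationHistogram</code> description above says that a probability falling outside every <code>[min_probability, max_probability)</code> interval has a frequency of zero. The following is a minimal sketch of that lookup, not part of the API. The helper name <code>bucket_size_for</code> is hypothetical, and treating a bucket with <code>minProbability == maxProbability</code> as matching only that exact value is an assumption based on the special case described for probability 1.</p>

```python
def bucket_size_for(probability, histogram):
    """Return the number of records whose estimated probability falls in the
    bucket containing `probability`, or 0 if no bucket covers it.

    `histogram` is the list found under "deltaPresenceEstimationHistogram".
    """
    for bucket in histogram:
        low = bucket.get("minProbability", 0.0)
        high = bucket.get("maxProbability", 0.0)
        in_half_open_interval = low <= probability < high
        # Assumption: a degenerate bucket (min == max) matches that exact value.
        is_degenerate_match = low == high == probability
        if in_half_open_interval or is_degenerate_match:
            # bucketSize is an int64 serialized as a string in the response.
            return int(bucket.get("bucketSize", 0))
    return 0


# The three records used as the example in the field description.
example_histogram = [
    {"minProbability": 0.0, "maxProbability": 0.1, "bucketSize": "17"},
    {"minProbability": 0.2, "maxProbability": 0.3, "bucketSize": "42"},
    {"minProbability": 0.3, "maxProbability": 0.4, "bucketSize": "99"},
]
```

<p>With the example records, a probability of 0.25 maps to the bucket of size 42, while 0.15 and 0.4 fall outside every interval and map to zero.</p>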
| |
| <div class="method"> |
| <code class="details" id="list_next">list_next(previous_request, previous_response)</code> |
| <pre>Retrieves the next page of results. |
| |
| Args: |
| previous_request: The request for the previous page. (required) |
| previous_response: The response from the request for the previous page. (required) |
| |
| Returns: |
| A request object that you can call 'execute()' on to request the next |
| page. Returns None if there are no more items in the collection. |
| </pre> |
| </div> |
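<p>A minimal sketch of how <code>list</code> and <code>list_next</code> might be combined to walk every page of results. The helper name <code>iter_dlp_jobs</code> is hypothetical, and the response key <code>"jobs"</code> is an assumption about the list response. The <code>dlp_jobs</code> argument is expected to be the resource object for this collection, for example the value of <code>service.projects().dlpJobs()</code>.</p>

```python
def iter_dlp_jobs(dlp_jobs, parent):
    """Yield every job under `parent`, following pages until none remain.

    dlp_jobs: the dlpJobs resource object exposing list() and list_next().
    parent: the parent resource name, e.g. "projects/my-project"
            (a hypothetical project ID).
    """
    request = dlp_jobs.list(parent=parent)
    while request is not None:
        response = request.execute()
        # Assumption: the page's jobs are returned under the "jobs" key.
        for job in response.get("jobs", []):
            yield job
        # list_next returns None when there are no more items in the collection.
        request = dlp_jobs.list_next(
            previous_request=request, previous_response=response
        )
```

<p>Because the helper is a generator, pages are fetched only as the caller consumes results, so stopping early avoids the remaining requests.</p>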
| |
| </body></html> |