Skip to main content

A simple query client for pcd label dataset

Project description

Query Client

This is a simple client to query.

example

初始化客户端及元数据

from query_client import QueryClient

qc = QueryClient()

tables = qc.get_meta()
# print(tables) 获取全部可查表,及其coloumns, 元数据
## 目前只有两个相关表注册
can_table = tables["can_detail"]
pcd_table = tables["pcd_label_desc"]

fetch

python
from sqlalchemy.sql.expression import select, text, and_
from query_client import QueryClient
import pandas as pd

qc = QueryClient()
tables = qc.get_meta()
mytable = tables["pcd_label_desc"]

# 1. fetch_by_stmt 初始化 statement, select/where/group_by
stmt = select(mytable.c.dataset, mytable.c.batch_path, text("array_distinct(categories) scenes"), text("cardinality(filter(categories, x -> x IN ('car'))) as target_cnt"))
r = qc.fetch_by_stmt(stmt)
df = pd.DataFrame(r)

# 2. fetch_by_sql
query = "select * from pcd_label_desc limit 10"
r = qc.fetch_by_sql(query)
df = pd.DataFrame(r)

具体示例

  1. 城区场景的筛选
    • 定义单帧指标(各种目标数)
    • group by 序列
  2. fetch_by_sql 实现方式
    • cte 语法:
      • t 表查询单帧的去重场景、不同类型的目标数指标,dataset 及batch 维度
      • 利用 t 表的dataset, batch_path 维度聚合指标
    • code example
      query = """
      WITH t AS (
          SELECT
              dataset,
              batch_path,
              array_distinct(categories) scenes,
              cardinality(
                  filter(
                      categories,
                      x -> x IN (
                          'car',
                          'van',
                          'truck',
                          'mini_truck',
                          'special_truck',
                          'truck',
                          'cyclist',
                          'bicycle',
                          'pedestrian'
                      )
                  )
              ) AS target_cnt,
              cardinality(
                  filter(
                      categories,
                      x -> x IN (
                          'truck',
                          'mini_truck',
                          'special_truck'
                      )
                  )
              ) truck_cnt,
              cardinality(filter(categories, x -> x IN ('car', 'van'))) carvan_cnt,
              cardinality(filter(categories, x -> x = 'pedestrian')) pedestrian_cnt,
              cardinality(
                  filter(categories, x -> x IN ('cyclist', 'bicycle'))
              ) cyclist_cnt
          FROM
              pcd_label_desc
          WHERE
              delivery_date = '20230310'
              AND any_match(
                  road_condition,
                  x -> x = 'highway_road'
              )
      )
      SELECT
          dataset,
          batch_path,
          array_distinct(FLATTEN(ARRAY_AGG(scenes))) scenes,
          SUM(target_cnt) target_cnt,
          SUM(carvan_cnt) carvan_cnt,
          SUM(pedestrian_cnt) pedestrian_cnt,
          SUM(truck_cnt) truck_cnt,
          SUM(cyclist_cnt) cyclist_cnt
      FROM
          t
      GROUP BY
          1,
          2
      ORDER BY
          target_cnt DESC
      """
      
      r = qc.fetch_by_sql(query)
      df = pd.DataFrame(r)
      
  3. fetch_by_stmt 实现方式
    mytable = tables["pcd_label_desc"]
    
    stmt = select(
            mytable.c.dataset, 
            mytable.c.batch_path, # 指定table聚合的维度
    	text("cardinality(filter(categories, x -> x IN ('car', 'van'))) carvan_cnt"), # 定义函数变换指标
            # ... goon
            )
            .where(and_(
                mytable.c.delivery_date == "20230310",
                text("any_match(road_condition, x -> x = 'city_road')") # 过滤城区场景
    	))
    
    # 指标cte 表
    cte = qc.with_cte(stmt)
    
    # 继续聚合
    stmt2 = select(
        cte.c.dataset, 
        cte.c.batch_path,
        text("SUM(carvan_cnt) carvan_cnt")
        ).group_by(cte.c.dataset, cte.c.batch_path)
    
    # 查询结果
    r = qc.fetch_by_stmt(stmt2)
    df = pd.DataFrame(r)
    

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

query_client-0.0.1.tar.gz (4.1 kB view details)

Uploaded Source

Built Distribution

query_client-0.0.1-py3-none-any.whl (4.4 kB view details)

Uploaded Python 3

File details

Details for the file query_client-0.0.1.tar.gz.

File metadata

  • Download URL: query_client-0.0.1.tar.gz
  • Upload date:
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for query_client-0.0.1.tar.gz
Algorithm Hash digest
SHA256 6ce95267438135c0b9beb432d6edf9048cd438bfc7fc43593a4e5101f8e1976e
MD5 919b5b5961ed34a1e4cb4dc33e2662a3
BLAKE2b-256 649a67799e79775e66eae44842696ca8c76e79004df258938ed36e47df7203b3

See more details on using hashes here.

File details

Details for the file query_client-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: query_client-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 4.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for query_client-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 40fba790a25f607f771e106b4cd8217514bc5b437c1335d2ad454c35cef04598
MD5 513895aa32d601ca6a0045513375d61f
BLAKE2b-256 b95c96233baf16fb013d409525c9f2a422d283dc80f614f5bac17a667226e3d3

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page