Snowflake Snowpark for Python
Snowflake Snowpark Python and Snowpark pandas APIs
The Snowpark library provides intuitive APIs for querying and processing data in a data pipeline. Using this library, you can build applications that process data in Snowflake without having to move data to the system where your application code runs.
Source code | Snowpark Python developer guide | Snowpark Python API reference | Snowpark pandas developer guide | Snowpark pandas API reference | Product documentation | Samples
Getting started
Have your Snowflake account ready
If you don't have a Snowflake account yet, you can sign up for a 30-day free trial account.
Create a Python virtual environment
You can use miniconda, anaconda, or virtualenv to create a Python 3.9, 3.10, 3.11, 3.12, or 3.13 virtual environment.
For Snowpark pandas, only Python 3.9, 3.10, or 3.11 is supported.
To have the best experience when using this library with UDFs, creating a local conda environment with the Snowflake channel is recommended.
Install the library to the Python virtual environment
pip install snowflake-snowpark-python
To use the Snowpark pandas API, you can optionally install the following, which installs modin in the same environment. The Snowpark pandas API provides a familiar interface for pandas users to query and process data directly in Snowflake.
pip install "snowflake-snowpark-python[modin]"
Create a session and use the Snowpark Python API
from snowflake.snowpark import Session
connection_parameters = {
  "account": "<your snowflake account>",
  "user": "<your snowflake user>",
  "password": "<your snowflake password>",
  "role": "<snowflake user role>",
  "warehouse": "<snowflake warehouse>",
  "database": "<snowflake database>",
  "schema": "<snowflake schema>"
}
session = Session.builder.configs(connection_parameters).create()
# Create a Snowpark dataframe from input data
df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"]) 
df = df.filter(df.a > 1)
result = df.collect()
df.show()
# -------------
# |"A"  |"B"  |
# -------------
# |3    |4    |
# -------------
Create a session and use the Snowpark pandas API
import modin.pandas as pd
import snowflake.snowpark.modin.plugin
from snowflake.snowpark import Session
CONNECTION_PARAMETERS = {
    'account': '<myaccount>',
    'user': '<myuser>',
    'password': '<mypassword>',
    'role': '<myrole>',
    'database': '<mydatabase>',
    'schema': '<myschema>',
    'warehouse': '<mywarehouse>',
}
session = Session.builder.configs(CONNECTION_PARAMETERS).create()
# Create a Snowpark pandas dataframe from input data
df = pd.DataFrame([['a', 2.0, 1],['b', 4.0, 2],['c', 6.0, None]], columns=["COL_STR", "COL_FLOAT", "COL_INT"])
df
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0      1.0
# 1       b        4.0      2.0
# 2       c        6.0      NaN
df.shape
# (3, 3)
df.head(2)
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0        1
# 1       b        4.0        2
df.dropna(subset=["COL_INT"], inplace=True)
df
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0        1
# 1       b        4.0        2
df.shape
# (2, 3)
df.head(2)
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0        1
# 1       b        4.0        2
# Save the result back to Snowflake with a row_pos column.
df.reset_index(drop=True).to_snowflake('pandas_test2', index=True, index_label=['row_pos'])
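To read the table back into a Snowpark pandas DataFrame, you can use pd.read_snowflake. A minimal sketch, assuming the table written above:
df2 = pd.read_snowflake('pandas_test2')
df2.shape
# (2, 4)  # expected: the row_pos column is included as a regular column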
Samples
The Snowpark Python developer guide, Snowpark Python API reference, Snowpark pandas developer guide, and Snowpark pandas API reference have basic sample code. Snowflake-Labs has more curated demos.
Logging
Configure the logging level for snowflake.snowpark to capture Snowpark Python API logs.
Snowpark uses the Snowflake Python Connector, so you may also want to configure the logging level for snowflake.connector when the error originates in the Python Connector.
For instance:
import logging
for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
Reading and writing to pandas DataFrame
Snowpark Python API supports reading from and writing to a pandas DataFrame via the to_pandas and write_pandas commands.
To use these operations, ensure that pandas is installed in the same environment. You can install pandas alongside Snowpark Python by executing the following command:
pip install "snowflake-snowpark-python[pandas]"
Once pandas is installed, you can convert between a Snowpark DataFrame and pandas DataFrame as follows:
df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
# Convert Snowpark DataFrame to pandas DataFrame
pandas_df = df.to_pandas() 
# Write pandas DataFrame to a Snowflake table and return Snowpark DataFrame
snowpark_df = session.write_pandas(pandas_df, "new_table", auto_create_table=True)
Snowpark pandas API also supports writing to pandas:
import modin.pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
# Convert Snowpark pandas DataFrame to pandas DataFrame
pandas_df = df.to_pandas() 
Note that the above Snowpark pandas commands will work if Snowpark is installed with the [modin] option; the additional [pandas] installation is not required.
Verifying Package Signatures
To ensure the authenticity and integrity of the Python package, follow the steps below to verify the package signature using cosign.
Steps to verify the signature:
- Install cosign. This example uses the Go installation: installing-cosign-with-go.
- Download the package file from a repository such as PyPI.
- Download the signature files from the release tag, replacing the version number with the version you are verifying.
- Verify the signature:
# replace the version number with the version you are verifying
./cosign verify-blob snowflake_snowpark_python-1.22.1-py3-none-any.whl \
  --certificate snowflake_snowpark_python-1.22.1-py3-none-any.whl.crt \
  --certificate-identity https://github.com/snowflakedb/snowpark-python/.github/workflows/python-publish.yml@refs/tags/v1.22.1 \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --signature snowflake_snowpark_python-1.22.1-py3-none-any.whl.sig
Verified OK
Contributing
Please refer to CONTRIBUTING.md.
Release History
1.42.0 (2025-10-28)
Snowpark Python API Updates
New Features
- Snowpark Python DB-API is now generally available. Access this feature with DataFrameReader.dbapi() to read data from a database table or query into a DataFrame using a DBAPI connection.
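A minimal sketch of reading through a DBAPI connection (assumes an existing session; sqlite3 and the table name are illustrative only, and supported drivers and keyword arguments may vary):
import sqlite3

def create_connection():
    # return any DBAPI 2.0-compliant connection; sqlite3 is just for illustration
    return sqlite3.connect("example.db")

df = session.read.dbapi(create_connection, table="MY_TABLE")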
1.41.0 (2025-10-23)
Snowpark Python API Updates
New Features
- Added a new function service in snowflake.snowpark.functions that allows users to create a callable representing a Snowpark Container Services (SPCS) service.
- Added a connection_parameters parameter to the DataFrameReader.dbapi() (PuPr) method to allow passing keyword arguments to the create_connection callable.
- Added support for Session.begin_transaction, Session.commit, and Session.rollback (see the sketch after this list).
- Added support for the following geospatial functions in functions.py:
- st_interpolate
- st_intersection
- st_intersection_agg
- st_intersects
- st_isvalid
- st_length
- st_makegeompoint
- st_makeline
- st_makepolygon
- st_makepolygonoriented
- st_disjoint
- st_distance
- st_dwithin
- st_endpoint
- st_envelope
- st_geohash
- st_geomfromgeohash
- st_geompointfromgeohash
- st_hausdorffdistance
- st_makepoint
- st_npoints
- st_perimeter
- st_pointn
- st_setsrid
- st_simplify
- st_srid
- st_startpoint
- st_symdifference
- st_transform
- st_union
- st_union_agg
- st_within
- st_x
- st_xmax
- st_xmin
- st_y
- st_ymax
- st_ymin
- st_geogfromgeohash
- st_geogpointfromgeohash
- st_geographyfromwkb
- st_geographyfromwkt
- st_geometryfromwkb
- st_geometryfromwkt
- try_to_geography
- try_to_geometry
 
 
- Added a parameter to enable and disable automatic column name aliasing for the interval_day_time_from_parts and interval_year_month_from_parts functions.
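A minimal sketch of the new transaction APIs (assumes an existing session; the table name is hypothetical):
session.begin_transaction()
try:
    session.sql("INSERT INTO my_table VALUES (1)").collect()
    session.commit()
except Exception:
    session.rollback()
    raise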
Bug Fixes
- Fixed a bug where DataFrameReader.xml fails to parse XML files with undeclared namespaces when ignoreNamespace is True.
- Added a fix for floating point precision discrepancies in interval_day_time_from_parts.
- Fixed a bug where writing Snowpark pandas dataframes on the pandas backend with a column multiindex to Snowflake with to_snowflake would raise KeyError.
- Fixed a bug where DataFrameReader.dbapi (PuPr) was not compatible with oracledb 3.4.0.
- Fixed a bug where modin would unintentionally be imported during session initialization in some scenarios.
- Fixed a bug where session.udf|udtf|udaf|sproc.register failed when an extra session argument was passed. These methods do not expect a session argument; please remove it if provided.
Improvements
- The default maximum length for inferred StringType columns during schema inference in DataFrameReader.dbapi is now increased from 16MB to 128MB in Parquet-file-based ingestion.
Dependency Updates
- Updated the dependency on snowflake-connector-python to >=3.17,<5.0.0.
Snowpark pandas API Updates
New Features
- Added support for the dtypes parameter of pd.get_dummies.
- Added support for nunique in df.pivot_table, df.agg, and other places where aggregate functions can be used.
- Added support for DataFrame.interpolate and Series.interpolate with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL INTERPOLATE_LINEAR, INTERPOLATE_FFILL, and INTERPOLATE_BFILL functions (PuPr).
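A minimal sketch of linear interpolation with Snowpark pandas (assumes a session has already been created as shown in Getting started):
import modin.pandas as pd
import snowflake.snowpark.modin.plugin

s = pd.Series([1.0, None, 3.0])
s.interpolate(method="linear")  # the gap is filled with 2.0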
Improvements
- Improved performance of Series.to_snowflake and pd.to_snowflake(series) for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable modin.config.PandasToSnowflakeParquetThresholdBytes.
- Enhanced autoswitching functionality from Snowflake to native pandas for methods with unsupported argument combinations:
- get_dummies() with dummy_na=True, drop_first=True, or custom dtype parameters
- cumsum(), cummin(), cummax() with axis=1 (column-wise operations)
- skew() with axis=1 or numeric_only=False parameters
- round() with a decimals parameter given as a Series
- corr() with method != 'pearson'
 
- Set cte_optimization_enabled to True for all Snowpark pandas sessions.
- Add support for the following in faster pandas:
- isin
- isna
- isnull
- notna
- notnull
- str.contains
- str.startswith
- str.endswith
- str.slice
- dt.date
- dt.time
- dt.hour
- dt.minute
- dt.second
- dt.microsecond
- dt.nanosecond
- dt.year
- dt.month
- dt.day
- dt.quarter
- dt.is_month_start
- dt.is_month_end
- dt.is_quarter_start
- dt.is_quarter_end
- dt.is_year_start
- dt.is_year_end
- dt.is_leap_year
- dt.days_in_month
- dt.daysinmonth
- sort_values
- loc (setting columns)
- to_datetime
- rename
- drop
- invert
- duplicated
- iloc
- head
- columns (e.g., df.columns = ["A", "B"])
- agg
- min
- max
- count
- sum
- mean
- median
- std
- var
- groupby.agg
- groupby.min
- groupby.max
- groupby.count
- groupby.sum
- groupby.mean
- groupby.median
- groupby.std
- groupby.var
- drop_duplicates
 
- Reuse row count from the relaxed query compiler in get_axis_len.
Bug Fixes
- Fixed a bug where the row count was not getting cached in the ordered dataframe each time count_rows() was called.
1.40.0 (2025-10-02)
Snowpark Python API Updates
New Features
- Added a new module snowflake.snowpark.secrets that provides Python wrappers for accessing Snowflake Secrets within Python UDFs and stored procedures that execute inside Snowflake (see the sketch at the end of this New Features list):
- get_generic_secret_string
- get_oauth_access_token
- get_secret_type
- get_username_password
- get_cloud_provider_token
 
- Added support for the following scalar functions in functions.py:
- Conditional expression functions:
- booland
- boolnot
- boolor
- boolxor
- boolor_agg
- decode
- greatest_ignore_nulls
- least_ignore_nulls
- nullif
- nvl2
- regr_valx
 
- Semi-structured and structured data functions:
- array_remove_at
- as_boolean
- map_delete
- map_insert
- map_pick
- map_size
 
- String & binary functions:
- chr
- hex_decode_binary
 
- Numeric functions:
- div0null
 
- Differential privacy functions:
- dp_interval_high
- dp_interval_low
 
- Context functions:
- last_query_id
- last_transaction
 
- Geospatial functions:
- h3_cell_to_boundary
- h3_cell_to_children
- h3_cell_to_children_string
- h3_cell_to_parent
- h3_cell_to_point
- h3_compact_cells
- h3_compact_cells_strings
- h3_coverage
- h3_coverage_strings
- h3_get_resolution
- h3_grid_disk
- h3_grid_distance
- h3_int_to_string
- h3_polygon_to_cells
- h3_polygon_to_cells_strings
- h3_string_to_int
- h3_try_grid_path
- h3_try_polygon_to_cells
- h3_try_polygon_to_cells_strings
- h3_uncompact_cells
- h3_uncompact_cells_strings
- haversine
- h3_grid_path
- h3_is_pentagon
- h3_is_valid_cell
- h3_latlng_to_cell
- h3_latlng_to_cell_string
- h3_point_to_cell
- h3_point_to_cell_string
- h3_try_coverage
- h3_try_coverage_strings
- h3_try_grid_distance
- st_area
- st_asewkb
- st_asewkt
- st_asgeojson
- st_aswkb
- st_aswkt
- st_azimuth
- st_buffer
- st_centroid
- st_collect
- st_contains
- st_coveredby
- st_covers
- st_difference
- st_dimension
 
 
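A minimal, hypothetical sketch of a handler reading a secret inside Snowflake (the secret name, its configuration, and the attributes of the returned object are assumptions; secret setup is not shown):
from snowflake.snowpark.secrets import get_username_password

def handler() -> str:
    # 'my_cred' is a hypothetical secret configured for the UDF/procedure;
    # assumes the returned object exposes username/password attributes
    creds = get_username_password("my_cred")
    return creds.username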
Bug Fixes
- Fixed a bug where DataFrame.limit() failed when there was parameter binding in the executed SQL and it was used in a non-stored-procedure/UDxF environment.
- Added an experimental fix for a bug in schema query generation that could cause invalid SQL to be generated when using nested structured types.
- Fixed multiple bugs in DataFrameReader.dbapi (PuPr):
- Fixed a UDTF ingestion failure with the pyodbc driver caused by unprocessed row data.
- Fixed SQL Server query input failure due to incorrect select query generation.
- Fixed UDTF ingestion not preserving column nullability in the output schema.
- Fixed an issue that caused the program to hang during multithreaded Parquet based ingestion when a data fetching error occurred.
- Fixed a bug in schema parsing when custom schema strings used upper-cased data type names (NUMERIC, NUMBER, DECIMAL, VARCHAR, STRING, TEXT).
 
- Fixed a bug in Session.create_dataframe where schema string parsing failed when using upper-cased data type names (e.g., NUMERIC, NUMBER, DECIMAL, VARCHAR, STRING, TEXT).
Improvements
- Improved DataFrameReader.dbapi (PuPr) so that dbapi will not retry on non-retryable errors, such as SQL syntax errors, in external data source queries.
- Removed unnecessary warnings about local package version mismatch when using session.read.option('rowTag', <tag_name>).xml(<stage_file_path>) or xpath functions.
- Improved DataFrameReader.dbapi (PuPr) reading performance by setting the default fetch_size parameter value to 100000.
- Improved the error message for XSD validation failure when reading XML files using session.read.option('rowValidationXSDPath', <xsd_path>).xml(<stage_file_path>).
Snowpark pandas API Updates
Dependency Updates
- Updated the supported modin versions to >=0.36.0 and <0.38.0 (was previously >=0.35.0 and <0.37.0).
New Features
- Added support for DataFrame.query for dataframes with single-level indexes.
- Added support for DataFrameGroupBy.__len__ and SeriesGroupBy.__len__.
Improvements
- Hybrid execution mode is now enabled by default. Certain operations on smaller data will now automatically execute in native pandas in-memory. Use from modin.config import AutoSwitchBackend; AutoSwitchBackend.disable() to turn this off and force all execution to occur in Snowflake (see the sketch after this list).
- Added a session parameter pandas_hybrid_execution_enabled to enable/disable hybrid execution as an alternative to using AutoSwitchBackend.
- Removed an unnecessary SHOW OBJECTS query issued from read_snowflake under certain conditions.
- When hybrid execution is enabled, pd.merge, pd.concat, DataFrame.merge, and DataFrame.join may now move arguments to backends other than those among the function arguments.
- Improved performance of DataFrame.to_snowflake and pd.to_snowflake(dataframe) for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable modin.config.PandasToSnowflakeParquetThresholdBytes.
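A minimal sketch of opting out of hybrid execution, grounded in the note above:
from modin.config import AutoSwitchBackend

AutoSwitchBackend.disable()  # force all execution to occur in Snowflake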
1.39.1 (2025-09-25)
Snowpark Python API Updates
Bug Fixes
- Added an experimental fix for a bug in schema query generation that could cause invalid SQL to be generated when using nested structured types.
1.39.0 (2025-09-17)
Snowpark Python API Updates
New Features
- Added support for unstructured data engineering in Snowpark, powered by Snowflake AISQL and Cortex functions:
- DataFrame.ai.complete: Generate per-row LLM completions from prompts built over columns and files.
- DataFrame.ai.filter: Keep rows where an AI classifier returns TRUE for the given predicate.
- DataFrame.ai.agg: Reduce a text column into one result using a natural-language task description.
- RelationalGroupedDataFrame.ai_agg: Perform the same natural-language aggregation per group.
- DataFrame.ai.classify: Assign single or multiple labels from given categories to text or images.
- DataFrame.ai.similarity: Compute cosine-based similarity scores between two columns via embeddings.
- DataFrame.ai.sentiment: Extract overall and aspect-level sentiment from text into JSON.
- DataFrame.ai.embed: Generate VECTOR embeddings for text or images using configurable models.
- DataFrame.ai.summarize_agg: Aggregate and produce a single comprehensive summary over many rows.
- DataFrame.ai.transcribe: Transcribe audio files to text with optional timestamps and speaker labels.
- DataFrame.ai.parse_document: OCR/layout-parse documents or images into structured JSON.
- DataFrame.ai.extract: Pull structured fields from text or files using a response schema.
- DataFrame.ai.count_tokens: Estimate token usage for a given model and input text per row.
- DataFrame.ai.split_text_markdown_header: Split Markdown into hierarchical header-aware chunks.
- DataFrame.ai.split_text_recursive_character: Split text into size-bounded chunks using recursive separators.
- DataFrameReader.file: Create a DataFrame containing all files from a stage as FILE data type for downstream unstructured data processing.
 
- Added a new datatype YearMonthIntervalType that allows users to create intervals for datetime operations.
- Added a new function interval_year_month_from_parts that allows users to easily create YearMonthIntervalType without using SQL.
- Added a new datatype DayTimeIntervalType that allows users to create intervals for datetime operations.
- Added a new function interval_day_time_from_parts that allows users to easily create DayTimeIntervalType without using SQL.
- Added support for FileOperation.list to list files in a stage with metadata.
- Added support for FileOperation.remove to remove files in a stage.
- Added an option to specify copy_grants for the following DataFrame APIs:
- create_or_replace_view
- create_or_replace_temp_view
- create_or_replace_dynamic_table
 
- Added a new function snowflake.snowpark.functions.vectorized that allows users to mark a function as a vectorized UDF.
- Added support for the parameter use_vectorized_scanner in the function Session.write_pandas().
- Added support for the parameter session_init_statement in UDTF ingestion of DataFrameReader.jdbc (PrPr).
- Added support for the following scalar functions in functions.py:
- getdate
- getvariable
- invoker_role
- invoker_share
- is_application_role_in_session
- is_database_role_in_session
- is_granted_to_invoker_role
- is_role_in_session
- localtime
- systimestamp
 
Bug Fixes
- Fixed a bug where query_timeout did not work in UDTF ingestion of DataFrameReader.jdbc (PrPr).
Deprecations
- Deprecated warnings will be triggered when using snowpark-python with Python 3.9. For more details, please refer to https://docs.snowflake.com/en/developer-guide/python-runtime-support-policy.
Dependency Updates
Improvements
- Unsupported types in DataFrameReader.dbapi (PuPr) are now ingested as StringType.
- Improved the error message to list available columns when a dataframe cannot resolve a given column name.
- Added a new option cacheResult to DataFrameReader.xml that allows users to cache the result of the XML reader to a temporary table after calling xml. This helps improve performance when subsequent operations are performed on the same DataFrame.
Snowpark pandas API Updates
New Features
- Added support for DataFrame.eval() for dataframes with single-level indexes.
Improvements
- Downgraded to level logging.DEBUG - 1 the log message saying that the Snowpark DataFrame reference of an internal DataFrameReference object has changed.
- Eliminated duplicate parameter check queries for casing status when retrieving the session.
- Retrieved dataframe row counts through object metadata to avoid a COUNT(*) query (performance).
- Added support for applying the Snowflake Cortex function Complete.
- Introduced faster pandas: improved performance by deferring row position computation.
- The following operations are currently supported and can benefit from the optimization: read_snowflake, repr, loc, reset_index, merge, and binary operations.
- If a lazy object (e.g., DataFrame or Series) depends on a mix of supported and unsupported operations, the optimization will not be used.
 
- Updated the error message shown when Snowpark pandas is referenced within apply.
- Added a session parameter dummy_row_pos_optimization_enabled to enable/disable dummy row position optimization in faster pandas.
Dependency Updates
- Updated the supported modin versions to >=0.35.0 and <0.37.0 (was previously >=0.34.0 and <0.36.0).
Bug Fixes
- Fixed an issue with drop_duplicates where the same data source could be read multiple times in the same query but in a different order each time, resulting in missing rows in the final result. The fix ensures that the data source is read only once.
- Fixed a bug with hybrid execution mode where an AssertionError was unexpectedly raised by certain indexing operations.
Snowpark Local Testing Updates
New Features
- Added support to allow patching functions.ai_complete.
1.38.0 (2025-09-04)
Snowpark Python API Updates
New Features
- Added support for the following AI-powered functions in functions.py:
- ai_extract
- ai_parse_document
- ai_transcribe
 
- Added time travel support for querying historical data (see the sketch after this list):
- Session.table() now supports the time travel parameters time_travel_mode, statement, offset, timestamp, timestamp_type, and stream.
- DataFrameReader.table() supports the same time travel parameters as direct arguments.
- DataFrameReader supports time travel via option chaining (e.g., session.read.option("time_travel_mode", "at").option("offset", -60).table("my_table")).
 
- Added support for specifying the following parameters to DataFrameWriter.copy_into_location for validation and writing data to external locations:
- validation_mode
- storage_integration
- credentials
- encryption
 
- Added support for Session.directory and Session.read.directory to retrieve the list of all files on a stage with metadata.
- Added support for DataFrameReader.jdbc (PrPr), which allows ingesting external data sources with a JDBC driver.
- Added support for FileOperation.copy_files to copy files from a source location to an output stage.
- Added support for the following scalar functions in functions.py:
- all_user_names
- bitand
- bitand_agg
- bitor
- bitor_agg
- bitxor
- bitxor_agg
- current_account_name
- current_client
- current_ip_address
- current_role_type
- current_organization_name
- current_organization_user
- current_secondary_roles
- current_transaction
- getbit
 
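A minimal sketch of time travel via option chaining, based on the example above (assumes an existing session; the table name is hypothetical):
df = (
    session.read
    .option("time_travel_mode", "at")
    .option("offset", -60)   # read the table as of 60 seconds ago
    .table("my_table")
)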
Bug Fixes
- Fixed the repr of TimestampType to match the actual subtype it represents.
- Fixed a bug in DataFrameReader.dbapi where UDTF ingestion did not work in stored procedures.
- Fixed a bug in schema inference that caused incorrect stage prefixes to be used.
Improvements
- Enhanced error handling in DataFrameReader.dbapi thread-based ingestion to prevent unnecessary operations, which improves resource efficiency.
- Bumped the cloudpickle dependency to also support cloudpickle==3.1.1 in addition to previous versions.
- Improved DataFrameReader.dbapi (PuPr) ingestion performance for PostgreSQL and MySQL by using a server-side cursor to fetch data.
Snowpark pandas API Updates
New Features
- Completed support for pd.read_snowflake(), pd.to_iceberg(), pd.to_pandas(), pd.to_snowpark(), pd.to_snowflake(), DataFrame.to_iceberg(), DataFrame.to_pandas(), DataFrame.to_snowpark(), DataFrame.to_snowflake(), Series.to_iceberg(), Series.to_pandas(), Series.to_snowpark(), and Series.to_snowflake() on the "Pandas" and "Ray" backends. Previously, only some of these functions and methods were supported on the Pandas backend.
- Added support for Index.get_level_values().
Improvements
- Set the default transfer limit in hybrid execution for data leaving Snowflake to 100k, which can be overridden with the SnowflakePandasTransferThreshold environment variable. This configuration is appropriate for scenarios with two available engines, "Pandas" and "Snowflake" on relational workloads.
- Improved the import error message by adding --upgrade to pip install "snowflake-snowpark-python[modin]" in the error message.
- Reduce the telemetry messages from the modin client by pre-aggregating into 5 second windows and only keeping a narrow band of metrics which are useful for tracking hybrid execution and native pandas performance.
- Set the initial row count only when hybrid execution is enabled. This reduces the number of queries issued for many workloads.
- Add a new test parameter for integration tests to enable hybrid execution.
Bug Fixes
- Raised NotImplementedError instead of AttributeError on attempting to call the Snowflake extension functions/methods to_dynamic_table(), cache_result(), to_view(), create_or_replace_dynamic_table(), and create_or_replace_view() on dataframes or series using the pandas or ray backends.
1.37.0 (2025-08-18)
Snowpark Python API Updates
New Features
- Added support for the following xpath functions in functions.py:
- xpath
- xpath_string
- xpath_boolean
- xpath_int
- xpath_float
- xpath_double
- xpath_long
- xpath_short
 
- Added support for the parameter use_vectorized_scanner in the function Session.write_arrow().
- The DataFrame profiler now adds the following information about each query: describe query time, execution time, and SQL query text. To view this information, call session.dataframe_profiler.enable() and call get_execution_profile() on a dataframe.
- Added support for DataFrame.col_ilike.
- Added support for non-blocking stored procedure calls that return AsyncJob objects (see the sketch after this list):
- Added a block: bool = True parameter to Session.call(). When block=False, it returns an AsyncJob instead of blocking until completion.
- Added a block: bool = True parameter to StoredProcedure.__call__() for async support across both named and anonymous stored procedures.
- Added Session.call_nowait(), which is equivalent to Session.call(block=False).
 
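A minimal sketch of a non-blocking stored procedure call (assumes an existing session; the procedure name is hypothetical):
job = session.call("MY_PROC", block=False)  # returns an AsyncJob
# ... do other work while the procedure runs ...
result = job.result()  # blocks until the procedure completes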
Bug Fixes
- Fixed a bug in the CTE optimization stage where a deepcopy of internal plans would cause a memory spike when a dataframe was created locally using session.create_dataframe() with large input data.
- Fixed a bug in DataFrameReader.parquet where the ignore_case option in the infer_schema_options was not respected.
- Fixed a bug where to_pandas() produced a different format of column name when the query result format was set to 'JSON' versus 'ARROW'.
Deprecations
- Deprecated pkg_resources.
Dependency Updates
- Added a dependency on protobuf<6.32
Snowpark pandas API Updates
New Features
- Added support for efficient transfer of data between Snowflake and Ray with the DataFrame.set_backend method. The installed version of modin must be at least 0.35.0, and ray must be installed.
Improvements
Dependency Updates
- Updated the supported modin versions to >=0.34.0 and <0.36.0 (was previously >=0.33.0 and <0.35.0).
- Added support for pandas 2.3 when the installed modin version is at least 0.35.0.
Bug Fixes
- Fixed an issue in hybrid execution mode (PrPr) where pd.to_datetime and pd.to_timedelta would unexpectedly raise IndexError.
- Fixed a bug where pd.explain_switch would raise IndexError or return None if called before any potential switch operations were performed.
- Fixed a bug where calling pd.concat(axis=0) on a dataframe with the default, positional index and a dataframe with a different index would produce invalid SQL.
1.36.0 (2025-08-05)
Snowpark Python API Updates
New Features
- Session.create_dataframe now accepts keyword arguments that are forwarded to the internal call to Session.write_pandas or Session.write_arrow when creating a DataFrame from a pandas DataFrame or a pyarrow Table.
- Added new APIs for AsyncJob (see the sketch after this list):
- AsyncJob.is_failed() returns a bool indicating whether a job has failed. It can be used in combination with AsyncJob.is_done() to determine whether a job is finished and errored.
- AsyncJob.status() returns a string representing the current query status (e.g., "RUNNING", "SUCCESS", "FAILED_WITH_ERROR") for detailed monitoring without calling result().
 
- Added a dataframe profiler. To use it, call get_execution_profile() on your desired dataframe. This profiler reports the queries executed to evaluate a dataframe and statistics about each of the query operators. It is currently an experimental feature.
- Added support for the following functions in functions.py:
- ai_sentiment
 
- Updated the interface for the experimental feature context.configure_development_features. All development features are disabled by default unless explicitly enabled by the user.
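A minimal sketch of polling the new AsyncJob APIs (assumes an AsyncJob named job, e.g. from an asynchronous call):
if job.is_done():
    if job.is_failed():
        print("query failed")
    else:
        print(job.status())  # e.g. "SUCCESS"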
Snowpark pandas API Updates
New Features
Improvements
- Hybrid execution row estimate improvements and a reduction of eager calls.
- Add a new configuration variable to control transfer costs out of Snowflake when using hybrid execution.
- Added support for creating permanent and immutable UDFs/UDTFs with DataFrame/Series/GroupBy.apply, map, and transform by passing the snowflake_udf_params keyword argument. See the documentation for details.
- Added support for mapping np.unique to DataFrame and Series inputs using pd.unique.
Bug Fixes
- Fixed an issue where the Snowpark pandas plugin would unconditionally disable AutoSwitchBackend even when users had explicitly configured it via environment variables or programmatically.
1.35.0 (2025-07-24)
Snowpark Python API Updates
New Features
- Added support for the following functions in functions.py:
- ai_embed
- try_parse_json
 
Bug Fixes
- Fixed a bug in DataFrameReader.dbapi (PrPr) where dbapi failed in Python stored procedures with process exit code 1.
- Fixed a bug in DataFrameReader.dbapi (PrPr) where custom_schema accepted an illegal schema.
- Fixed a bug in DataFrameReader.dbapi (PrPr) where custom_schema did not work when connecting to PostgreSQL and MySQL.
- Fixed a bug in schema inference that would cause it to fail for external stages.
Improvements
- Improved the query parameter in DataFrameReader.dbapi (PrPr) so that parentheses are not needed around the query.
- Improved the error experience in DataFrameReader.dbapi (PrPr) when exceptions happen while inferring the schema of the target data source.
Snowpark Local Testing Updates
New Features
- Added local testing support for reading files with SnowflakeFile using local file paths, the Snow URL semantic (snow://...), local testing framework stages, and Snowflake stages (@stage/file_path).
Snowpark pandas API Updates
New Features
- Added support for DataFrame.boxplot.
Improvements
- Reduced the number of UDFs/UDTFs created by repeated calls to apply or map with the same arguments on Snowpark pandas objects.
- Added an example for reading a file from a stage in the docstring for pd.read_excel.
- Implemented more efficient data transfer between the Snowflake and Ray backends of Modin (requires modin>=0.35.0 to use).
Bug Fixes
- Added an upper bound to the row estimation when the cartesian product from an align or join results in a very large number. This mitigates a performance regression.
- Fixed a pd.read_excel bug when reading files inside a stage's inner directory.
1.34.0 (2025-07-15)
Snowpark Python API Updates
New Features
- Added a new option TRY_CAST to DataFrameReader. When TRY_CAST is True, columns are wrapped in a TRY_CAST statement rather than a hard cast when loading data.
- Added a new option USE_RELAXED_TYPES to the INFER_SCHEMA_OPTIONS of DataFrameReader. When set to True, this option casts all strings to max-length strings and all numeric types to DoubleType.
- Added debuggability improvements to eagerly validate dataframe schema metadata. Enable it using snowflake.snowpark.context.configure_development_features().
- Added a new function snowflake.snowpark.dataframe.map_in_pandas that allows users to map a function across a dataframe. The mapping function takes an iterator of pandas dataframes as input and provides one as output.
- Added a TTL cache for describe queries. Repeated queries within a 15-second interval use the cached value rather than re-querying Snowflake.
- Added a parameter fetch_with_process to DataFrameReader.dbapi (PrPr) to enable multiprocessing for parallel data fetching in local ingestion. By default, local ingestion uses multithreading. Multiprocessing may improve performance for CPU-bound tasks like Parquet file generation.
- Added a new function snowflake.snowpark.functions.model that allows users to call methods of a model.
Improvements
- Added support for row validation using an XSD schema via the rowValidationXSDPath option when reading XML files with a row tag using the rowTag option.
- Improved SQL generation for session.table().sample() to generate a flat SQL statement.
- Added support for complex column expressions as input for functions.explode.
- Added debuggability improvements to show which Python lines an SQL compilation error corresponds to. Enable it using snowflake.snowpark.context.configure_development_features(). This feature also depends on AST collection being enabled in the session, which can be done using session.ast_enabled = True.
- Set enforce_ordering=True when calling to_snowpark_pandas() from a Snowpark dataframe containing DML/DDL queries instead of throwing a NotImplementedError.
Bug Fixes
- Fixed a bug caused by redundant validation when creating an iceberg table.
- Fixed a bug in DataFrameReader.dbapi (PrPr) where closing the cursor or connection could unexpectedly raise an error and terminate the program.
- Fixed ambiguous column errors when using table functions in DataFrame.select() that have output columns matching the input DataFrame's columns. This improvement works when dataframe columns are provided as Column objects.
- Fixed a bug where having a NULL in a column of DecimalType would cast the column to FloatType instead and lead to precision loss.
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug when processing windowed functions that led to incorrect indexing in results.
- When a scalar numeric is passed to fillna, non-numeric columns are now ignored instead of producing an error.
Snowpark pandas API Updates
New Features
- Added support for DataFrame.to_excel and Series.to_excel.
- Added support for pd.read_feather, pd.read_orc, and pd.read_stata.
- Added support for pd.explain_switch() to return debugging information on hybrid execution decisions.
- Added support for pd.read_snowflake when the global modin backend is Pandas.
- Added support for pd.to_dynamic_table, pd.to_iceberg, and pd.to_view.
Improvements
- Added modin telemetry on API calls and hybrid engine switches.
- Show more helpful error messages to Snowflake Notebook users when the modin or pandas version does not match our requirements.
- Added a data type guard to the cost functions for hybrid execution mode (PrPr) which checks for data type compatibility.
- Added automatic switching to the pandas backend in hybrid execution mode (PrPr) for many methods that are not directly implemented in Snowpark pandas.
- Set the 'type' and other standard fields for Snowpark pandas telemetry.
Dependency Updates
- Added tqdm and ipywidgets as dependencies so that progress bars appear when switching between modin backends.
- Updated the supported modin versions to >=0.33.0 and <0.35.0 (was previously >=0.32.0 and <0.34.0).
Bug Fixes
- Fixed a bug in hybrid execution mode (PrPr) where certain Series operations would raise TypeError: numpy.ndarray object is not callable.
- Fixed a bug in hybrid execution mode (PrPr) where calling numpy operations like np.where on modin objects with the Pandas backend would raise an AttributeError. This fix requires modin version 0.34.0 or newer.
- Fixed an issue in df.melt where the resulting values had an additional suffix applied.
1.33.0 (2025-06-19)
Snowpark Python API Updates
New Features
- Added support for MySQL in DataFrameWriter.dbapi (PrPr) for both Parquet and UDTF-based ingestion.
- Added support for PostgreSQL in DataFrameReader.dbapi (PrPr) for both Parquet and UDTF-based ingestion.
- Added support for Databricks in DataFrameWriter.dbapi (PrPr) for UDTF-based ingestion.
- Added support to DataFrameReader to enable use of PATTERN when reading files with INFER_SCHEMA enabled.
- Added support for the following AI-powered functions in functions.py:
- ai_complete
- ai_similarity
- ai_summarize_agg (originally summarize_agg)
- different config options for ai_classify
 
- Added support for more options when reading XML files with a row tag using the rowTag option:
- Added support for removing namespace prefixes from column names using the ignoreNamespace option.
- Added support for specifying the prefix for the attribute column in the result table using the attributePrefix option.
- Added support for excluding attributes from the XML element using the excludeAttributes option.
- Added support for specifying the column name for the value when there are attributes in an element that has no child elements using the valueTag option.
- Added support for specifying the value to treat as a null value using the nullValue option.
- Added support for specifying the character encoding of the XML file using the charset option.
- Added support for ignoring surrounding whitespace in the XML element using the ignoreSurroundingWhitespace option.
 
- Added support for the parameter return_dataframe in Session.call, which can be used to set the return type of the functions to a DataFrame object.
- Added a new argument to Dataframe.describe called strings_include_math_stats that triggers stddev and mean to be calculated for String columns.
- Added support for retrieving Edge.properties when retrieving lineage from DGQL in DataFrame.lineage.trace.
- Added a parameter table_exists to DataFrameWriter.save_as_table that allows specifying whether a table already exists. This allows skipping a table lookup that can be expensive.
Bug Fixes
- Fixed a bug in DataFrameReader.dbapi (PrPr) where a create_connection defined as a local function was incompatible with multiprocessing.
- Fixed a bug in DataFrameReader.dbapi (PrPr) where the Databricks TIMESTAMP type was converted to the Snowflake TIMESTAMP_NTZ type instead of the correct TIMESTAMP_LTZ type.
- Fixed a bug in DataFrameReader.json where repeated reads with the same reader object would create incorrectly quoted columns.
- Fixed a bug in DataFrame.to_pandas() that would drop column names when converting a dataframe that did not originate from a select statement.
- Fixed a bug where DataFrame.create_or_replace_dynamic_table raised an error when the dataframe contained a UDTF and SELECT * in the UDTF was not parsed correctly.
- Fixed a bug where casted columns could not be used in the values clause of in functions.
Improvements
- Improved the error message for Session.write_pandas()andSession.create_dataframe()when the input pandas DataFrame does not have a column.
- Improved DataFrame.select when the arguments contain a table function with output columns that collide with columns of the current dataframe. With the improvement, if the user provides non-colliding columns in df.select("col1", "col2", table_func(...)) as string arguments, the query generated by the Snowpark client will not raise an ambiguous column error.
- Improved DataFrameReader.dbapi (PrPr) to use in-memory Parquet-based ingestion for better performance and security.
- Improved DataFrameReader.dbapi (PrPr) to use MATCH_BY_COLUMN_NAME=CASE_SENSITIVE in the copy-into-table operation.
Snowpark Local Testing Updates
New Features
- Added support for snow urls (snow://) in local file testing.
Bug Fixes
- Fixed a bug in Column.isin that would cause incorrect filtering on joined or previously filtered data.
- Fixed a bug in snowflake.snowpark.functions.concat_ws that would cause results to have an incorrect index.
Snowpark pandas API Updates
Dependency Updates
- Updated the modin dependency constraint from 0.32.0 to >=0.32.0, <0.34.0. The latest version tested with Snowpark pandas is modin 0.33.1.
New Features
- Added support for Hybrid Execution (PrPr). By running from modin.config import AutoSwitchBackend; AutoSwitchBackend.enable(), Snowpark pandas will automatically choose whether to run certain pandas operations locally or on Snowflake. This feature is disabled by default.
Improvements
- Set the default value of the index parameter to False for DataFrame.to_view, Series.to_view, DataFrame.to_dynamic_table, and Series.to_dynamic_table.
- Added the iceberg_version option to table creation functions.
- Reduced the query count for many operations, including insert, repr, and groupby, that previously issued a query to retrieve the input data's size.
Bug Fixes
- Fixed a bug in Series.where when the other parameter is an unnamed Series.
1.32.0 (2025-05-15)
Snowpark Python API Updates
Improvements
- Invoking Snowflake system procedures no longer invokes an additional describe procedure call to check the return type of the procedure.
- Added support for Session.create_dataframe() with the stage URL and FILE data type.
- Added support for different modes for dealing with corrupt XML records when reading an XML file using session.read.option('mode', <mode>).option('rowTag', <tag_name>).xml(<stage_file_path>). Currently PERMISSIVE, DROPMALFORMED, and FAILFAST are supported.
- Improved the error message of the XML reader when the specified row tag is not found in the file.
- Improved query generation for Dataframe.drop to use SELECT * EXCLUDE () to exclude the dropped columns. To enable this feature, set session.conf.set("use_simplified_query_generation", True).
- Added support for VariantType to StructType.from_json.
Bug Fixes
- Fixed a bug in DataFrameWriter.dbapi (PrPr) where Unicode or double-quoted column names in external databases caused errors because they were not quoted correctly.
- Fixed a bug where named fields in nested OBJECT data could cause errors when containing spaces.
- Fixed a bug with duplicated native_app_params parameters in the register UDAF function.
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug in snowflake.snowpark.functions.rank that would cause sort direction to not be respected.
- Fixed a bug in snowflake.snowpark.functions.to_timestamp_* that would cause incorrect results on filtered data.
Snowpark pandas API Updates
New Features
- Added support for dict values in Series.str.get, Series.str.slice, and Series.str.__getitem__ (Series.str[...]).
- Added support for DataFrame.to_html.
- Added support for DataFrame.to_string and Series.to_string.
- Added support for reading files from S3 buckets using pd.read_csv.
- Added the ENFORCE_EXISTING_FILE_FORMAT option to the DataFrameReader, which allows reading a dataframe based only on an existing file format object when used together with FORMAT_NAME.
Improvements
- Made iceberg_config a required parameter for DataFrame.to_iceberg and Series.to_iceberg.
1.31.1 (2025-05-05)
Snowpark Python API Updates
Bug Fixes
- Updated conda build configuration to deprecate Python 3.8 support, preventing installation in incompatible environments.
1.31.0 (2025-04-24)
Snowpark Python API Updates
New Features
- Added support for the restricted caller permission of the execute_as argument in StoredProcedure.register().
- Added support for non-select statements in DataFrame.to_pandas().
- Added support for the artifact_repository parameter to Session.add_packages, Session.add_requirements, Session.get_packages, Session.remove_package, and Session.clear_packages.
- Added support for reading an XML file using a row tag by session.read.option('rowTag', <tag_name>).xml(<stage_file_path>) (experimental); see the sketch after this list.
- Each XML record is extracted as a separate row.
- Each field within that record becomes a separate column of type VARIANT, which can be further queried using dot notation, e.g., col('a.b.c').
 
- Added updates to DataFrameReader.dbapi (PrPr):
- Added the fetch_merge_count parameter for optimizing performance by merging multiple fetched data batches into a single Parquet file.
- Added support for Databricks.
- Added support for ingestion with Snowflake UDTF.
 
- Added support for the following AI-powered functions in functions.py (Private Preview):
- prompt
- ai_filter (added support for the prompt() function and image files, and changed the second argument name from expr to file)
- ai_classify
 
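A minimal sketch of the experimental XML row-tag reader (assumes an existing session; the stage path and row tag are hypothetical):
# each <book> element becomes one row with VARIANT columns
df = session.read.option('rowTag', 'book').xml('@my_stage/books.xml')
df.show()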
Improvements
- Renamed the relaxed_ordering param to enforce_ordering for DataFrame.to_snowpark_pandas. The new default value, enforce_ordering=False, has the opposite effect of the previous default value, relaxed_ordering=False.
- Improved DataFrameReader.dbapi (PrPr) reading performance by setting the default fetch_size parameter value to 1000.
- Improved the error message for the invalid identifier SQL error by suggesting potentially matching identifiers.
- Reduced the number of describe queries issued when creating a DataFrame from a Snowflake table using session.table.
- Improved performance and accuracy of DataFrameAnalyticsFunctions.time_series_agg().
Bug Fixes
- Fixed a bug in DataFrame.group_by().pivot().agg when the pivot column and aggregate column are the same.
- Fixed a bug in DataFrameReader.dbapi (PrPr) where a TypeError was raised when create_connection returned a connection object of an unsupported driver type.
- Fixed a bug where a df.limit(0) call would not properly apply.
- Fixed a bug in DataFrameWriter.save_as_table that caused reserved names to throw errors when using append mode.
Deprecations
- Deprecated support for Python 3.8.
- Deprecated the argument sliding_interval in DataFrameAnalyticsFunctions.time_series_agg().
Snowpark Local Testing Updates
New Features
- Added support for Interval expression to Window.range_between.
- Added support for the array_construct function.
Bug Fixes
- Fixed a bug in local testing where the transient __pycache__ directory was unintentionally copied during stored procedure execution via import.
- Fixed a bug in local testing that created incorrect results for Column.like calls.
- Fixed a bug in local testing that caused Column.getItem and snowpark.snowflake.functions.get to raise IndexError rather than return null.
- Fixed a bug in local testing where a df.limit(0) call would not properly apply.
- Fixed a bug in local testing where a Table.merge into an empty table would cause an exception.
Snowpark pandas API Updates
Dependency Updates
- Updated modin from 0.30.1 to 0.32.0.
- Added support for numpy 2.0 and above.
New Features
- Added support for DataFrame.create_or_replace_view and Series.create_or_replace_view.
- Added support for DataFrame.create_or_replace_dynamic_table and Series.create_or_replace_dynamic_table.
- Added support for DataFrame.to_view and Series.to_view.
- Added support for DataFrame.to_dynamic_table and Series.to_dynamic_table.
- Added support for DataFrame.groupby.resample for the aggregations max, mean, median, min, and sum.
- Added support for reading stage files using:
- pd.read_excel
- pd.read_html
- pd.read_pickle
- pd.read_sas
- pd.read_xml
 
- Added support for DataFrame.to_iceberg and Series.to_iceberg.
- Added support for dict values in Series.str.len.
Improvements
- Improved performance of DataFrame.groupby.apply and Series.groupby.apply by avoiding an expensive pivot step.
- Added an estimate for the row count upper bound to OrderedDataFrame to enable better engine switching. This could potentially result in increased query counts.
- Renamed the relaxed_ordering param to enforce_ordering for pd.read_snowflake. The new default value, enforce_ordering=False, has the opposite effect of the previous default value, relaxed_ordering=False.
Bug Fixes
- Fixed a bug for pd.read_snowflake when reading Iceberg tables and enforce_ordering=True.
1.30.0 (2025-03-27)
Snowpark Python API Updates
New Features
- Added support for relaxed consistency and ordering guarantees in Dataframe.to_snowpark_pandas by introducing the new parameter relaxed_ordering.
- DataFrameReader.dbapi (PrPr) now accepts a list of strings for the session_init_statement parameter, allowing multiple SQL statements to be executed during session initialization.
Improvements
- Improved query generation for Dataframe.stat.sample_by to generate a single flat query that scales well with a large fractions dictionary, compared to the older method of creating a UNION ALL subquery for each key in fractions. To enable this feature, set session.conf.set("use_simplified_query_generation", True).
- Improved performance of DataFrameReader.dbapi by enabling the vectorized option when copying the parquet file into a table.
- Improved query generation for DataFrame.random_split in the following ways. They can be enabled by setting session.conf.set("use_simplified_query_generation", True):
- Removed the need to cache_result in the internal implementation of the input dataframe, resulting in a purely lazy dataframe operation.
- The seed argument now behaves as expected, with repeatable results across multiple calls and sessions.
 
- DataFrame.fillna and DataFrame.replace now both support fitting int and float into Decimal columns if include_decimal is set to True.
- Added documentation for the following UDF and stored procedure functions in files.py as a result of their General Availability:
- SnowflakeFile.write
- SnowflakeFile.writelines
- SnowflakeFile.writeable
 
- Minor documentation changes for SnowflakeFile and SnowflakeFile.open().
Bug Fixes
- Fixed a bug for the following functions that raised errors when .cast() is applied to their output:
- from_json
- size
 
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug in aggregation that caused empty groups to still produce rows.
- Fixed a bug in Dataframe.except_ that would cause rows to be incorrectly dropped.
- Fixed a bug that caused to_timestamp to fail when casting filtered columns.
Snowpark pandas API Updates
New Features
- Added support for list values in Series.str.__getitem__ (Series.str[...]).
- Added support for pd.Grouper objects in group by operations. When freq is specified, the default values of the sort, closed, label, and convention arguments are supported; origin is supported when it is start or start_day.
- Added support for relaxed consistency and ordering guarantees in pd.read_snowflake for both named data sources (e.g., tables and views) and query data sources by introducing the new parameter relaxed_ordering.
Improvements
- Raised a warning whenever QUOTED_IDENTIFIERS_IGNORE_CASE is found to be set, asking the user to unset it.
- Improved how a missing index_label in DataFrame.to_snowflake and Series.to_snowflake is handled when index=True. Instead of raising a ValueError, system-defined labels are used for the index columns.
- Improved the error message for groupby or DataFrame or Series.agg when the function name is not supported.
1.29.1 (2025-03-12)
Snowpark Python API Updates
Bug Fixes
- Fixed a bug in DataFrameReader.dbapi (PrPr) that prevented usage in stored procedures and Snowbooks.
1.29.0 (2025-03-05)
Snowpark Python API Updates
New Features
- Added support for the following AI-powered functions in functions.py (Private Preview):
- ai_filter
- ai_agg
- summarize_agg
 
- Added support for the new FILE SQL type, with the following related functions in functions.py (Private Preview):
- fl_get_content_type
- fl_get_etag
- fl_get_file_type
- fl_get_last_modified
- fl_get_relative_path
- fl_get_scoped_file_url
- fl_get_size
- fl_get_stage
- fl_get_stage_file_url
- fl_is_audio
- fl_is_compressed
- fl_is_document
- fl_is_image
- fl_is_video
 
- Added support for importing third-party packages from PyPI using Artifact Repository (Private Preview):
- Use the keyword arguments artifact_repository and artifact_repository_packages to specify your artifact repository and packages, respectively, when registering stored procedures or user defined functions.
- Supported APIs are:
- Session.sproc.register
- Session.udf.register
- Session.udaf.register
- Session.udtf.register
- functions.sproc
- functions.udf
- functions.udaf
- functions.udtf
- functions.pandas_udf
- functions.pandas_udtf
 
 
Bug Fixes
- Fixed a bug where creating a Dataframe with a large number of values raised an Unsupported feature 'SCOPED_TEMPORARY' error if the thread-safe session was disabled.
- Fixed a bug where df.describe raised an internal SQL execution error when the dataframe was created from reading a stage file and CTE optimization was enabled.
- Fixed a bug where df.order_by(A).select(B).distinct() would generate invalid SQL when simplified query generation was enabled using session.conf.set("use_simplified_query_generation", True).
- Disabled simplified query generation by default.
Improvements
- Improved version validation warnings for snowflake-snowpark-python package compatibility when registering stored procedures. Now, warnings are only triggered if the major or minor version does not match, while bugfix version differences no longer generate warnings.
- Bumped the cloudpickle dependency to also support cloudpickle==3.0.0 in addition to previous versions.
Snowpark Local Testing Updates
New Features
- Added support for literal values to the range_between window function.
Snowpark pandas API Updates
New Features
- Added support for list values in Series.str.slice.
- Added support for applying the Snowflake Cortex functions ClassifyText, Translate, and ExtractAnswer.
- Added support for Series.hist.
Improvements
- Improved performance of DataFrame.groupby.transform and Series.groupby.transform by avoiding an expensive pivot step.
- Improved the error message for pd.to_snowflake, DataFrame.to_snowflake, and Series.to_snowflake when the table does not exist.
- Improved readability of the docstring for the if_exists parameter in pd.to_snowflake, DataFrame.to_snowflake, and Series.to_snowflake.
- Improved the error message for all pandas functions that use UDFs with Snowpark objects.
Bug Fixes
- Fixed a bug in Series.rename_axis where an AttributeError was being raised.
- Fixed a bug where pd.get_dummies didn't ignore NULL/NaN values by default.
- Fixed a bug where repeated calls to pd.get_dummies resulted in a 'Duplicated column name' error.
- Fixed a bug in pd.get_dummies where passing a list of columns generated incorrect column labels in the output DataFrame.
- Updated pd.get_dummies to return bool values instead of int.
1.28.0 (2025-02-20)
Snowpark Python API Updates
New Features
- Added support for the following functions in functions.py:
- normal
- randn
 
- Added support for the allow_missing_columns parameter to Dataframe.union_by_name and Dataframe.union_all_by_name.
Improvements
- Improved query generation for Dataframe.distinct to generate SELECT DISTINCT instead of SELECT with GROUP BY all columns. To disable this feature, set session.conf.set("use_simplified_query_generation", False).
Deprecations
- Deprecated Snowpark Python function snowflake_cortex_summarize. Users can install snowflake-ml-python and use the snowflake.cortex.summarize function instead.
- Deprecated Snowpark Python function snowflake_cortex_sentiment. Users can install snowflake-ml-python and use the snowflake.cortex.sentiment function instead.
Bug Fixes
- Fixed a bug where session-level query tag was overwritten by a stacktrace for dataframes that generate multiple queries. Now, the query tag will only be set to the stacktrace if session.conf.set("collect_stacktrace_in_query_tag", True).
- Fixed a bug in Session._write_pandas where it was erroneously passing the use_logical_type parameter to Session._write_modin_pandas_helper when writing a Snowpark pandas object.
- Fixed a bug in options SQL generation that could cause multiple values to be formatted incorrectly.
- Fixed a bug in Session.catalog where empty strings for database or schema were not handled correctly and were generating erroneous SQL statements.
Experimental Features
- Added support for writing pyarrow Tables to Snowflake tables.
Snowpark pandas API Updates
New Features
- Added support for applying the Snowflake Cortex functions Summarize and Sentiment.
- Added support for list values in Series.str.get.
Bug Fixes
- Fixed a bug in apply where kwargs were not being correctly passed into the applied function.
Snowpark Local Testing Updates
New Features
- Added support for the following functions:
- hour
- minute
 
- Added support for the NULL_IF parameter to the CSV reader.
- Added support for the date_format, datetime_format, and timestamp_format options when loading CSVs.
Bug Fixes
- Fixed a bug in Dataframe.join that caused columns to have incorrect typing.
- Fixed a bug in when statements that caused incorrect results in the otherwise clause.
1.27.0 (2025-02-03)
Snowpark Python API Updates
New Features
- Added support for the following functions in functions.py:
  - array_reverse
  - divnull
  - map_cat
  - map_contains_key
  - map_keys
  - nullifzero
  - snowflake_cortex_sentiment
  - acosh
  - asinh
  - atanh
  - bit_length
  - bitmap_bit_position
  - bitmap_bucket_number
  - bitmap_construct_agg
  - bitshiftright_unsigned
  - cbrt
  - equal_null
  - from_json
  - ifnull
  - localtimestamp
  - max_by
  - min_by
  - nth_value
  - nvl
  - octet_length
  - position
  - regr_avgx
  - regr_avgy
  - regr_count
  - regr_intercept
  - regr_r2
  - regr_slope
  - regr_sxx
  - regr_sxy
  - regr_syy
  - try_to_binary
  - base64
  - base64_decode_string
  - base64_encode
  - editdistance
  - hex
  - hex_encode
  - instr
  - log1p
  - log2
  - log10
  - percentile_approx
  - unbase64
- Added support for the seed argument in DataFrame.stat.sample_by. Note that it only supports a Table object, and will be ignored for a DataFrame object.
- Added support for specifying a schema string (including implicit struct syntax) when calling DataFrame.create_dataframe (see the sketch after this list).
- Added support for DataFrameWriter.insert_into/insertInto. This method also supports local testing mode.
- Added support for DataFrame.create_temp_view to create a temporary view. It will fail if the view already exists.
- Added support for multiple columns in the functions map_cat and map_concat.
- Added an option keep_column_order for keeping the original column order in DataFrame.with_column and DataFrame.with_columns.
- Added options to column casts that allow renaming or adding fields in StructType columns.
- Added support for the contains_null parameter to ArrayType.
- Added support for creating a temporary view via DataFrame.create_or_replace_temp_view from a DataFrame created by reading a file from a stage.
- Added support for the value_contains_null parameter to MapType.
- Added support for using a Column object in Column.in_ and functions.in_ (see the sketch after this list).
- Added interactive to telemetry that indicates whether the current environment is an interactive one.
- Allowed session.file.get in a Native App to read file paths starting with / from the current version.
- Added support for multiple aggregation functions after DataFrame.pivot.
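A small sketch of two entries above: a schema string for DataFrame.create_dataframe and Column objects in Column.in_ (the exact schema-string syntax shown is an assumption):
from snowflake.snowpark.functions import col, lit

# schema given as a string instead of a StructType
df = session.create_dataframe([[1, "a"], [2, "b"]], schema="id: int, label: string")

# Column objects can now be used with Column.in_
df.filter(col("id").in_(lit(1), lit(2))).show()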
Experimental Features
- Added the Catalog class to manage Snowflake objects. It can be accessed via Session.catalog.
  - snowflake.core is a dependency required for this feature.
- Allow user input schema when reading JSON file on stage.
- Added support for specifying a schema string (including implicit struct syntax) when calling DataFrame.create_dataframe.
Improvements
- Updated README.md to include instructions on how to verify package signatures using cosign.
Bug Fixes
- Fixed a bug in local testing mode that caused a column to contain None when it should contain 0.
- Fixed a bug in StructField.from_json that prevented TimestampTypes with tzinfo from being parsed correctly.
- Fixed a bug in the function date_format that caused an error when the input column was date type or timestamp type.
- Fixed a bug where a null value could be inserted into a non-nullable DataFrame column.
- Fixed a bug in replace and lit which raised a type hint assertion error when passing Column expression objects.
- Fixed a bug in pandas_udf and pandas_udtf where the session parameter was erroneously ignored.
- Fixed a bug that raised an incorrect type conversion error for system functions called through session.call.
Snowpark pandas API Updates
New Features
- Added support for Series.str.ljust and Series.str.rjust.
- Added support for Series.str.center.
- Added support for Series.str.pad.
- Added support for applying Snowpark Python function snowflake_cortex_sentiment.
- Added support for DataFrame.map.
- Added support for DataFrame.from_dict and DataFrame.from_records.
- Added support for mixed case field names in struct type columns.
- Added support for SeriesGroupBy.unique.
- Added support for Series.dt.strftime with the following directives (see the sketch after this list):
  - %d: Day of the month as a zero-padded decimal number.
  - %m: Month as a zero-padded decimal number.
  - %Y: Year with century as a decimal number.
  - %H: Hour (24-hour clock) as a zero-padded decimal number.
  - %M: Minute as a zero-padded decimal number.
  - %S: Second as a zero-padded decimal number.
  - %f: Microsecond as a decimal number, zero-padded to 6 digits.
  - %j: Day of the year as a zero-padded decimal number.
  - %X: Locale's appropriate time representation.
  - %%: A literal '%' character.
- Added support for Series.between.
- Added support for include_groups=False in DataFrameGroupBy.apply.
- Added support for expand=True in Series.str.split.
- Added support for DataFrame.pop and Series.pop.
- Added support for first and last in DataFrameGroupBy.agg and SeriesGroupBy.agg.
- Added support for Index.drop_duplicates.
- Added support for aggregations "count", "median", np.median, "skew", "std", np.std, "var", and np.var in pd.pivot_table(), DataFrame.pivot_table(), and pd.crosstab().
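For instance, the strftime directives above can be combined (a sketch; assumes the Snowpark pandas plugin is imported):
s = pd.Series(pd.to_datetime(["2025-01-31 08:30:00", "2025-06-15 23:59:59"]))
s.dt.strftime("%Y-%m-%d %H:%M:%S")
# 0    2025-01-31 08:30:00
# 1    2025-06-15 23:59:59
# dtype: object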
Improvements
- Improved performance of the DataFrame.map, Series.apply and Series.map methods by mapping numpy functions to Snowpark functions if possible.
- Added documentation for DataFrame.map.
- Improved performance of DataFrame.apply by mapping numpy functions to Snowpark functions if possible.
- Added documentation on the extent of Snowpark pandas interoperability with scikit-learn.
- Infer the return type of functions in Series.map, Series.apply and DataFrame.map if a type hint is not provided.
- Added call_count to telemetry that counts method calls including interchange protocol calls.
1.26.0 (2024-12-05)
Snowpark Python API Updates
New Features
- Added support for the property version and class method get_active_session for the Session class.
- Added new methods and variables to enhance data type handling and JSON serialization/deserialization:
  - To DataType, its derived classes, and StructField:
    - type_name: Returns the type name of the data.
    - simple_string: Provides a simple string representation of the data.
    - json_value: Returns the data as a JSON-compatible value.
    - json: Converts the data to a JSON string.
  - To ArrayType, MapType, StructField, PandasSeriesType, PandasDataFrameType and StructType:
    - from_json: Enables these types to be created from JSON data.
  - To MapType:
    - keyType: keys of the map
    - valueType: values of the map
- Added support for the method appName in SessionBuilder.
- Added support for the include_nulls argument in DataFrame.unpivot.
- Added support for the following functions in functions.py (see the sketch after this list):
  - size to get the size of array, object, or map columns.
  - collect_list, an alias of array_agg.
  - substring now makes the len argument optional.
- Added parameter ast_enabled to session for internal usage (default: False).
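A quick sketch of the functions.py additions above (column names are illustrative):
from snowflake.snowpark.functions import col, collect_list, size, substring

df = session.create_dataframe([[["x", "y"], "hello"]], schema=["arr", "s"])
df.select(
    size(col("arr")),        # number of elements in the array column
    collect_list(col("s")),  # alias of array_agg
    substring(col("s"), 2),  # len is now optional
).show()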
Improvements
- Added support for specifying the following to DataFrame.create_or_replace_dynamic_table (see the sketches after this list):
  - iceberg_config: a dictionary that can hold the following iceberg configuration options:
    - external_volume
    - catalog
    - base_location
    - catalog_sync
    - storage_serialization_policy
- Added support for nested data types to DataFrame.print_schema.
- Added support for the level parameter to DataFrame.print_schema.
- Improved flexibility of the DataFrameReader and DataFrameWriter APIs by adding support for the following (see the sketches after this list):
  - Added a format method to DataFrameReader and DataFrameWriter to specify the file format when loading or unloading results.
  - Added a load method to DataFrameReader to work in conjunction with format.
  - Added a save method to DataFrameWriter to work in conjunction with format.
  - Added support for reading keyword arguments in the options method for DataFrameReader and DataFrameWriter.
- Relaxed the cloudpickle dependency for Python 3.11 to simplify build requirements. However, for Python 3.11, cloudpickle==2.2.1 remains the only supported version.
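Two brief sketches of the additions above (object names and option values are illustrative):
# dynamic table with iceberg_config
df.create_or_replace_dynamic_table(
    "my_dynamic_iceberg_table",
    warehouse="my_wh",
    lag="1 hour",
    iceberg_config={
        "external_volume": "my_volume",
        "catalog": "SNOWFLAKE",
        "base_location": "my/base/location",
    },
)

# reader/writer chaining with format, load, and save
df2 = session.read.format("csv").option("INFER_SCHEMA", True).load("@mystage/in.csv")
df2.write.format("parquet").save("@mystage/out/")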
Bug Fixes
- Removed warnings that dynamic pivot features were in private preview, because dynamic pivot is now generally available.
- Fixed a bug in session.read.options where False Boolean values were incorrectly parsed as True in the generated file format.
Dependency Updates
- Added a runtime dependency on python-dateutil.
Snowpark pandas API Updates
New Features
- Added partial support for Series.map when arg is a pandas Series or a collections.abc.Mapping. No support for instances of dict that implement __missing__ but are not instances of collections.defaultdict.
- Added support for DataFrame.align and Series.align for axis=1 and axis=None.
- Added support for pd.json_normalize.
- Added support for GroupBy.pct_change with axis=0, freq=None, and limit=None.
- Added support for DataFrameGroupBy.__iter__ and SeriesGroupBy.__iter__.
- Added support for np.sqrt, np.trunc, np.floor, numpy trig functions, np.exp, np.abs, np.positive and np.negative.
- Added partial support for the dataframe interchange protocol method DataFrame.__dataframe__().
Bug Fixes
- Fixed a bug in df.loc where setting a single column from a series resulted in unexpected None values.
Improvements
- Use UNPIVOT INCLUDE NULLS for unpivot operations in pandas instead of sentinel values.
- Improved documentation for pd.read_excel.
1.25.0 (2024-11-14)
Snowpark Python API Updates
New Features
- Added the following new functions in snowflake.snowpark.dataframe:
  - map
- Added support for passing the parameter include_error to Session.query_history to record queries that have errors during execution.
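A sketch of recording errored queries (assumes an active session):
with session.query_history(include_error=True) as history:
    try:
        session.sql("select * from table_that_does_not_exist").collect()
    except Exception:
        pass
print(history.queries)  # now includes the failed query record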
Improvements
- When the target stage is not set in the profiler, a default stage from Session.get_session_stage is used instead of raising SnowparkSQLException.
- Allowed lower case or mixed case input when calling Session.stored_procedure_profiler.set_active_profiler.
- Added distributed tracing using open telemetry APIs for action functions in DataFrame:
  - cache_result
- Removed opentelemetry warning from logging.
Bug Fixes
- Fixed the pre-action and post-action query propagation when In expressions were used in selects.
- Fixed a bug that raised the error AttributeError while calling Session.stored_procedure_profiler.get_output when Session.stored_procedure_profiler is disabled.
Dependency Updates
- Added a dependency on protobuf>=5.28 and tzlocal at runtime.
- Added a dependency on protoc-wheel-0 for the development profile.
- Require snowflake-connector-python>=3.12.0, <4.0.0 (was >=3.10.0).
Snowpark pandas API Updates
Dependency Updates
- Updated modin from 0.28.1 to 0.30.1.
- Added support for all pandas 2.2.x versions.
New Features
- Added support for Index.to_numpy.
- Added support for DataFrame.align and Series.align for axis=0.
- Added support for size in GroupBy.aggregate, DataFrame.aggregate, and Series.aggregate.
- Added support for snowflake.snowpark.functions.window.
- Added support for pd.read_pickle (Uses native pandas for processing).
- Added support for pd.read_html (Uses native pandas for processing).
- Added support for pd.read_xml (Uses native pandas for processing).
- Added support for the aggregation functions "size" and len in GroupBy.aggregate, DataFrame.aggregate, and Series.aggregate.
- Added support for list values in Series.str.len.
Bug Fixes
- Fixed a bug where aggregating a single-column dataframe with a single callable function (e.g. pd.DataFrame([0]).agg(np.mean)) would fail to transpose the result.
- Fixed bugs where DataFrame.dropna() would:
  - Treat an empty subset (e.g. []) as if it specified all columns instead of no columns.
  - Raise a TypeError for a scalar subset instead of filtering on just that column.
  - Raise a ValueError for a subset of type pandas.Index instead of filtering on the columns in the index.
- Disabled creation of scoped read-only tables to mitigate TableNotFoundError when using dynamic pivot in notebook environments.
- Fixed a bug when concatenating DataFrame or Series objects that come from the same DataFrame when axis=1.
Improvements
- Improved np.where with a scalar x value by eliminating unnecessary join and temp table creation.
- Improved get_dummies performance by flattening the pivot with a join.
- Improved align performance when aligning on the row position column by removing unnecessary window functions.
Snowpark Local Testing Updates
New Features
- Added support for patching functions that are unavailable in the snowflake.snowpark.functions module.
- Added support for snowflake.snowpark.functions.any_value.
Bug Fixes
- Fixed a bug where Table.update could not handle VariantType, MapType, and ArrayType data types.
- Fixed a bug where column aliases were incorrectly resolved in DataFrame.join, causing errors when selecting columns from a joined DataFrame.
- Fixed a bug where Table.update and Table.merge could fail if the target table's index was not the default RangeIndex.
1.24.0 (2024-10-28)
Snowpark Python API Updates
New Features
- Updated the Session class to be thread-safe. This allows concurrent DataFrame transformations, DataFrame actions, UDF and stored procedure registration, and concurrent file uploads when using the same Session object.
  - The feature is disabled by default and can be enabled by setting FEATURE_THREAD_SAFE_PYTHON_SESSION to True for the account.
  - Updating session configurations, like changing database or schema, when multiple threads are using the session may lead to unexpected behavior.
  - When enabled, some internally created temporary table names returned from the DataFrame.queries API are not deterministic, and may be different when DataFrame actions are executed. This does not affect explicit user-created temporary tables.
- Added support for the 'Service' domain to the session.lineage.trace API.
- Added support for the copy_grants parameter when registering UDxF and stored procedures.
- Added support for the following methods in DataFrameWriter to support daisy-chaining (see the sketch after this list):
  - option
  - options
  - partition_by
- Added support for snowflake_cortex_summarize.
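A sketch of the daisy-chained writer methods (stage path and column name are illustrative):
from snowflake.snowpark.functions import col

(
    df.write.option("header", True)
    .partition_by(col("year"))
    .csv("@mystage/unload/")
)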
Improvements
- Improved the function snowflake.snowpark.functions.array_remove; it is now possible to use it in Python.
- Disabled SQL simplification when sort is performed after limit.
  - Previously, df.sort().limit() and df.limit().sort() generated the same query with sort in front of limit. Now, df.limit().sort() will generate a query that reads df.limit().sort().
  - Improved the performance of the generated query for df.limit().sort(), because limit stops table scanning as soon as the number of records is satisfied.
- Added a client side error message for when an invalid stage location is passed to DataFrame read functions.
Bug Fixes
- Fixed a bug where the automatic cleanup of temporary tables could interfere with the results of async query execution.
- Fixed a bug in the DataFrame.analytics.time_series_agg function to handle multiple data points in the same sliding interval.
- Fixed a bug that created inconsistent casing in field names of structured objects in iceberg schemas.
Deprecations
- Deprecated warnings will be triggered when using snowpark-python with Python 3.8. For more details, please refer to https://docs.snowflake.com/en/developer-guide/python-runtime-support-policy.
Snowpark pandas API Updates
New Features
- Added support for np.subtract, np.multiply, np.divide, and np.true_divide.
- Added support for tracking usages of __array_ufunc__.
- Added numpy compatibility support for np.float_power, np.mod, np.remainder, np.greater, np.greater_equal, np.less, np.less_equal, np.not_equal, and np.equal.
- Added numpy compatibility support for np.log, np.log2, and np.log10.
- Added support for DataFrameGroupBy.bfill, SeriesGroupBy.bfill, DataFrameGroupBy.ffill, and SeriesGroupBy.ffill.
- Added support for the on parameter with Resampler.
- Added support for timedelta inputs in value_counts().
- Added support for applying Snowpark Python function snowflake_cortex_summarize.
- Added support for DataFrame.attrs and Series.attrs.
- Added support for DataFrame.style.
- Added numpy compatibility support for np.full_like.
Improvements
- Improved the generated SQL query for head and iloc when the row key is a slice.
- Improved the error message when passing an unknown timezone to tz_convert and tz_localize in Series, DataFrame, Series.dt, and DatetimeIndex.
- Improved documentation for tz_convert and tz_localize in Series, DataFrame, Series.dt, and DatetimeIndex to specify the supported timezone formats.
- Added additional kwargs support for df.apply and series.apply (as well as map and applymap) when using Snowpark functions. This allows for some position-independent compatibility between apply and functions where the first argument is not a pandas object.
- Improved the generated SQL query for iloc and iat when the row key is a scalar.
- Removed all joins in iterrows.
- Improved documentation for Series.map to reflect the unsupported features.
- Added support for np.may_share_memory, which is used internally by many scikit-learn functions. This method will always return False when called with a Snowpark pandas object.
Bug Fixes
- Fixed a bug where DataFrame and Series pct_change() would raise TypeError when the input contained timedelta columns.
- Fixed a bug where replace() would sometimes propagate Timedelta types incorrectly. replace() now raises NotImplementedError for Timedelta.
- Fixed a bug where DataFrame and Series round() would raise AssertionError for Timedelta columns. round() now raises NotImplementedError for Timedelta.
- Fixed a bug where reindex failed when the new index is a Series with non-overlapping types from the original index.
- Fixed a bug where calling __getitem__ on a DataFrameGroupBy object always returned a DataFrameGroupBy object if as_index=False.
- Fixed a bug where inserting timedelta values into an existing column would silently convert the values to integers instead of raising NotImplementedError.
- Fixed a bug where DataFrame.shift() on axis=0 and axis=1 would fail to propagate timedelta types.
- DataFrame.abs(), DataFrame.__neg__(), DataFrame.stack(), and DataFrame.unstack() now raise NotImplementedError for timedelta inputs instead of failing to propagate timedelta types.
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug where DataFrame.alias raised KeyError for input column names.
- Fixed a bug where to_csv on a Snowflake stage failed when data contained empty strings.
1.23.0 (2024-10-09)
Snowpark Python API Updates
New Features
- Added the following new functions in snowflake.snowpark.functions:
  - make_interval
- Added support for using Snowflake Interval constants with Window.range_between() when the order by column is of TIMESTAMP or DATE type (see the sketch after this list).
- Added support for file writes. This feature is currently in private preview.
- Added thread_id to QueryRecord to track the thread ID submitting the query history.
- Added support for Session.stored_procedure_profiler.
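A sketch combining make_interval with the interval-based range_between (column names are illustrative; the negation of the interval bound is an assumption based on the documented usage):
from snowflake.snowpark import Window
from snowflake.snowpark.functions import col, make_interval, sum as sum_

# rolling one-day window over a TIMESTAMP order-by column
w = Window.order_by(col("ts")).range_between(-make_interval(days=1), make_interval(days=0))
df.select(col("ts"), sum_(col("amount")).over(w).alias("rolling_sum"))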
Improvements
Bug Fixes
- Fixed a bug where registering a stored procedure or UDxF with type hints would give the warning 'NoneType' has no len() when trying to read default values from the function.
Snowpark pandas API Updates
New Features
- Added support for the TimedeltaIndex.mean method.
- Added support for some cases of aggregating Timedelta columns on axis=0 with agg or aggregate.
- Added support for by, left_by, right_by, left_index, and right_index for pd.merge_asof.
- Added support for passing the parameter include_describe to Session.query_history.
- Added support for the DatetimeIndex.mean and DatetimeIndex.std methods.
- Added support for Resampler.asfreq, Resampler.indices, Resampler.nunique, and Resampler.quantile.
- Added support for resample frequencies W, ME, YE with closed = "left".
- Added support for DataFrame.rolling.corr and Series.rolling.corr for pairwise = False and int window.
- Added support for string time-based window and min_periods = None for Rolling.
- Added support for DataFrameGroupBy.fillna and SeriesGroupBy.fillna.
- Added support for constructing Series and DataFrame objects with the lazy Index object as data, index, and columns arguments.
- Added support for constructing Series and DataFrame objects with index and column values not present in DataFrame/Series data.
- Added support for pd.read_sas (Uses native pandas for processing).
- Added support for applying rolling().count() and expanding().count() to Timedelta series and columns.
- Added support for tz in both pd.date_range and pd.bdate_range.
- Added support for Series.items.
- Added support for errors="ignore" in pd.to_datetime.
- Added support for DataFrame.tz_localize and Series.tz_localize.
- Added support for DataFrame.tz_convert and Series.tz_convert.
- Added support for applying Snowpark Python functions (e.g., sin) in Series.map, Series.apply, DataFrame.apply and DataFrame.applymap.
Improvements
- Improved to_pandas to persist the original timezone offset for TIMESTAMP_TZ type.
- Improved dtype results for TIMESTAMP_TZ type to show the correct timezone offset.
- Improved dtype results for TIMESTAMP_LTZ type to show the correct timezone.
- Improved the error message when passing a non-bool value to numeric_only for groupby aggregations.
- Removed unnecessary warning about the sort algorithm in sort_values.
- Used SCOPED objects for internal temp table creation. SCOPED objects are stored-procedure scoped if created within a stored procedure, and otherwise session scoped; the objects are automatically cleaned up at the end of the scope.
- Improved warning messages for operations that lead to materialization with inadvertent slowness.
- Removed an unnecessary warning message about convert_dtype in Series.apply.
Bug Fixes
- Fixed a bug where an Index object created from a Series/DataFrame incorrectly updated the Series/DataFrame's index name after an inplace update had been applied to the original Series/DataFrame.
- Suppressed an unhelpful SettingWithCopyWarning that sometimes appeared when printing Timedelta columns.
- Fixed the inplace argument for Series objects derived from other Series objects.
- Fixed a bug where Series.sort_values failed if the series name overlapped with the index column name.
- Fixed a bug where transposing a dataframe would map Timedelta index levels to integer column levels.
- Fixed a bug where Resampler methods on timedelta columns would produce integer results.
- Fixed a bug where pd.to_numeric() would leave Timedelta inputs as Timedelta instead of converting them to integers.
- Fixed loc set when setting a single row, or multiple rows, of a DataFrame with a Series value.
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug where nullable columns were annotated wrongly.
- Fixed a bug where the date_add and date_sub functions failed for NULL values.
- Fixed a bug where equal_null could fail inside a merge statement.
- Fixed a bug where row_number could fail inside a Window function.
- Fixed a bug where updates could fail when the source is the result of a join.
1.22.1 (2024-09-11)
This is a re-release of 1.22.0. Please refer to the 1.22.0 release notes for detailed release content.
1.22.0 (2024-09-10)
Snowpark Python API Updates
New Features
- Added the following new functions in snowflake.snowpark.functions:
  - array_remove
  - ln
Improvements
- Improved documentation for Session.write_pandas by making the use_logical_type option more explicit.
- Added support for specifying the following to DataFrameWriter.save_as_table (see the sketch after this list):
  - enable_schema_evolution
  - data_retention_time
  - max_data_extension_time
  - change_tracking
  - copy_grants
  - iceberg_config: a dictionary that can hold the following iceberg configuration options:
    - external_volume
    - catalog
    - base_location
    - catalog_sync
    - storage_serialization_policy
- Added support for specifying the following to DataFrameWriter.copy_into_table:
  - iceberg_config: a dictionary that can hold the following iceberg configuration options:
    - external_volume
    - catalog
    - base_location
    - catalog_sync
    - storage_serialization_policy
- Added support for specifying the following parameters to DataFrame.create_or_replace_dynamic_table:
  - mode
  - refresh_mode
  - initialize
  - clustering_keys
  - is_transient
  - data_retention_time
  - max_data_extension_time
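A sketch of save_as_table with some of the options above (names and values are illustrative):
df.write.save_as_table(
    "my_iceberg_table",
    mode="overwrite",
    change_tracking=True,
    iceberg_config={
        "external_volume": "my_volume",
        "catalog": "SNOWFLAKE",
        "base_location": "my/base/location",
    },
)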
Bug Fixes
- Fixed a bug in session.read.csv that caused an error when setting PARSE_HEADER = True in an externally defined file format.
- Fixed a bug in query generation from set operations that allowed generation of duplicate queries when children have common subqueries.
- Fixed a bug in session.get_session_stage that referenced a non-existing stage after switching database or schema.
- Fixed a bug where calling DataFrame.to_snowpark_pandas without explicitly initializing the Snowpark pandas plugin caused an error.
- Fixed a bug where using the explode function in dynamic table creation caused a SQL compilation error due to improper boolean type casting on the outer parameter.
Snowpark Local Testing Updates
New Features
- Added support for type coercion when passing columns as input to UDF calls.
- Added support for Index.identical.
Bug Fixes
- Fixed a bug where the truncate mode in DataFrameWriter.save_as_table incorrectly handled DataFrames containing only a subset of columns from the existing table.
- Fixed a bug where the function to_timestamp did not set the default timezone of the column datatype.
Snowpark pandas API Updates
New Features
- Added limited support for the Timedelta type, including the following features. Snowpark pandas will raise NotImplementedError for unsupported Timedelta use cases.
  - supporting tracking the Timedelta type through copy, cache_result, shift, sort_index, assign, bfill, ffill, fillna, compare, diff, drop, dropna, duplicated, empty, equals, insert, isin, isna, items, iterrows, join, len, mask, melt, merge, nlargest, nsmallest, to_pandas.
  - converting non-timedelta to timedelta via astype.
  - NotImplementedError will be raised for the rest of the methods that do not support Timedelta.
  - support for subtracting two timestamps to get a Timedelta.
  - support indexing with Timedelta data columns.
  - support for adding or subtracting timestamps and Timedelta.
  - support for binary arithmetic between two Timedelta values.
  - support for binary arithmetic and comparisons between Timedelta values and numeric values.
  - support for lazy TimedeltaIndex.
  - support for pd.to_timedelta.
  - support for GroupBy aggregations min, max, mean, idxmax, idxmin, std, sum, median, count, any, all, size, nunique, head, tail, aggregate.
  - support for GroupBy filtrations first and last.
  - support for TimedeltaIndex attributes: days, seconds, microseconds and nanoseconds.
  - support for diff with timestamp columns on axis=0 and axis=1.
  - support for TimedeltaIndex methods: ceil, floor and round.
  - support for the TimedeltaIndex.total_seconds method.
- Added support for index's arithmetic and comparison operators.
- Added support for Series.dt.round.
- Added documentation pages for DatetimeIndex.
- Added support for Index.name, Index.names, Index.rename, and Index.set_names.
- Added support for Index.__repr__.
- Added support for DatetimeIndex.month_name and DatetimeIndex.day_name.
- Added support for Series.dt.weekday, Series.dt.time, and DatetimeIndex.time.
- Added support for Index.min and Index.max.
- Added support for pd.merge_asof.
- Added support for Series.dt.normalize and DatetimeIndex.normalize.
- Added support for Index.is_boolean, Index.is_integer, Index.is_floating, Index.is_numeric, and Index.is_object.
- Added support for DatetimeIndex.round, DatetimeIndex.floor and DatetimeIndex.ceil.
- Added support for Series.dt.days_in_month and Series.dt.daysinmonth.
- Added support for DataFrameGroupBy.value_counts and SeriesGroupBy.value_counts.
- Added support for Series.is_monotonic_increasing and Series.is_monotonic_decreasing.
- Added support for Index.is_monotonic_increasing and Index.is_monotonic_decreasing.
- Added support for pd.crosstab.
- Added support for pd.bdate_range and included business frequency support (B, BME, BMS, BQE, BQS, BYE, BYS) for both pd.date_range and pd.bdate_range.
- Added support for lazy Index objects as labels in DataFrame.reindex and Series.reindex.
- Added support for Series.dt.days, Series.dt.seconds, Series.dt.microseconds, and Series.dt.nanoseconds.
- Added support for creating a DatetimeIndex from an Index of numeric or string type.
- Added support for string indexing with Timedelta objects.
- Added support for the Series.dt.total_seconds method.
- Added support for DataFrame.apply(axis=0).
- Added support for Series.dt.tz_convert and Series.dt.tz_localize.
- Added support for DatetimeIndex.tz_convert and DatetimeIndex.tz_localize.
Improvements
- Improved concat and join performance when operations are performed on series coming from the same dataframe by avoiding unnecessary joins.
- Refactored quoted_identifier_to_snowflake_type to avoid making metadata queries if the types have been cached locally.
- Improved pd.to_datetime to handle all local input cases.
- Create a lazy index from another lazy index without pulling data to the client.
- Raised NotImplementedError for Index bitwise operators.
- Display a clearer error message when Index.names is set to a non-list-like object.
- Raise a warning whenever MultiIndex values are pulled in locally.
- Improved the warning message for pd.read_snowflake to include the creation reason when temp table creation is triggered.
- Improved performance for DataFrame.set_index, or setting DataFrame.index or Series.index, by avoiding checks that require eager evaluation. As a consequence, when the new index does not match the current Series/DataFrame object length, a ValueError is no longer raised. Instead, when the Series/DataFrame object is longer than the provided index, the Series/DataFrame's new index is filled with NaN values for the "extra" elements. Otherwise, the extra values in the provided index are ignored.
- Properly raise NotImplementedError when ambiguous/nonexistent are non-string in ceil/floor/round.
Bug Fixes
- Stopped ignoring nanoseconds in pd.Timedelta scalars.
- Fixed AssertionError in trees of binary operations.
- Fixed a bug in Series.dt.isocalendar using a named Series.
- Fixed the inplace argument for Series objects derived from DataFrame columns.
- Fixed a bug where Series.reindex and DataFrame.reindex did not update the result index's name correctly.
- Fixed a bug where Series.take did not error when axis=1 was specified.
1.21.1 (2024-09-05)
Snowpark Python API Updates
Bug Fixes
- Fixed a bug where using to_pandas_batches with async jobs caused an error due to improper handling of waiting for asynchronous query completion.
1.21.0 (2024-08-19)
Snowpark Python API Updates
New Features
- Added support for snowflake.snowpark.testing.assert_dataframe_equal, a utility function to check the equality of two Snowpark DataFrames.
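A minimal usage sketch:
from snowflake.snowpark.testing import assert_dataframe_equal

expected = session.create_dataframe([[1, "a"]], schema=["id", "v"])
actual = session.create_dataframe([[1, "a"]], schema=["id", "v"])
assert_dataframe_equal(actual, expected)  # raises AssertionError on mismatch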
Improvements
- Added support for server-side string size limitations.
- Added support to create and invoke stored procedures, UDFs and UDTFs with optional arguments.
- Added support for column lineage in the DataFrame.lineage.trace API.
- Added support for passing INFER_SCHEMA options to DataFrameReader via INFER_SCHEMA_OPTIONS.
- Added support for passing the parameters parameter to Column.rlike and Column.regexp.
- Added support for automatically cleaning up temporary tables created by df.cache_result() in the current session, when the DataFrame is no longer referenced (i.e., gets garbage collected). It is still an experimental feature not enabled by default, and can be enabled by setting session.auto_clean_up_temp_table_enabled to True.
- Added support for string literals to the fmt parameter of snowflake.snowpark.functions.to_date.
- Added support for the system$reference function.
Bug Fixes
- Fixed a bug where SQL generated for selecting the * column had an incorrect subquery.
- Fixed a bug in DataFrame.to_pandas_batches where the iterator could throw an error if a certain transformation is made to the pandas dataframe due to the wrong isolation level.
- Fixed a bug in DataFrame.lineage.trace to split the quoted feature view's name and version correctly.
- Fixed a bug in Column.isin that caused invalid SQL generation when passed an empty list.
- Fixed a bug that failed to raise NotImplementedError when setting a cell with a list-like item.
Snowpark Local Testing Updates
New Features
- Added support for the following APIs:
  - snowflake.snowpark.functions
    - rank
    - dense_rank
    - percent_rank
    - cume_dist
    - ntile
    - datediff
    - array_agg
  - snowflake.snowpark.column.Column.within_group
- Added support for parsing flags in regex statements for mocked plans. This maintains parity with the rlike and regexp changes above.
Bug Fixes
- Fixed a bug where the Window Functions LEAD and LAG do not handle the option ignore_nulls properly.
- Fixed a bug where values were not populated into the result DataFrame during the insertion of table merge operation.
Improvements
- Fixed a pandas FutureWarning about integer indexing.
Snowpark pandas API Updates
New Features
- Added support for DataFrame.backfill, DataFrame.bfill, Series.backfill, and Series.bfill.
- Added support for DataFrame.compare and Series.compare with default parameters.
- Added support for Series.dt.microsecond and Series.dt.nanosecond.
- Added support for Index.is_unique and Index.has_duplicates.
- Added support for Index.equals.
- Added support for Index.value_counts.
- Added support for Series.dt.day_name and Series.dt.month_name.
- Added support for indexing on Index, e.g., df.index[:10].
- Added support for DataFrame.unstack and Series.unstack.
- Added support for DataFrame.asfreq and Series.asfreq.
- Added support for Series.dt.is_month_start and Series.dt.is_month_end.
- Added support for Index.all and Index.any.
- Added support for Series.dt.is_year_start and Series.dt.is_year_end.
- Added support for Series.dt.is_quarter_start and Series.dt.is_quarter_end.
- Added support for lazy DatetimeIndex.
- Added support for Series.argmax and Series.argmin.
- Added support for Series.dt.is_leap_year.
- Added support for DataFrame.items.
- Added support for Series.dt.floor and Series.dt.ceil.
- Added support for Index.reindex.
- Added support for DatetimeIndex properties: year, month, day, hour, minute, second, microsecond, nanosecond, date, dayofyear, day_of_year, dayofweek, day_of_week, weekday, quarter, is_month_start, is_month_end, is_quarter_start, is_quarter_end, is_year_start, is_year_end and is_leap_year.
- Added support for Resampler.fillna and Resampler.bfill.
- Added limited support for the Timedelta type, including creating Timedelta columns and to_pandas.
- Added support for Index.argmax and Index.argmin.
Improvements
- Removed the public preview warning message when importing Snowpark pandas.
- Removed an unnecessary count query from the SnowflakeQueryCompiler.is_series_like method.
- Dataframe.columns now returns a native pandas Index object instead of a Snowpark Index object.
- Refactored and introduced the query_compiler argument in the Index constructor to create an Index from a query compiler.
- pd.to_datetime now returns a DatetimeIndex object instead of a Series object.
- pd.date_range now returns a DatetimeIndex object instead of a Series object.
Bug Fixes
- Made passing an unsupported aggregation function to pivot_table raise NotImplementedError instead of KeyError.
- Removed axis labels and callable names from error messages and telemetry about unsupported aggregations.
- Fixed AssertionError in Series.drop_duplicates and DataFrame.drop_duplicates when called after sort_values.
- Fixed a bug in Index.to_frame where the result frame's column name could be wrong when the name is unspecified.
- Fixed a bug where some Index docstrings were ignored.
- Fixed a bug in Series.reset_index(drop=True) where the result name could be wrong.
- Fixed a bug in Groupby.first/last to order by the correct columns in the underlying window expression.
1.20.0 (2024-07-17)
Snowpark Python API Updates
Improvements
- Added distributed tracing using open telemetry APIs for the table stored procedure function in DataFrame:
  - _execute_and_get_query_id
- Added support for the arrays_zip function.
- Improved performance for binary column expressions and df._in by avoiding unnecessary casts for numeric values. You can enable this optimization by setting session.eliminate_numeric_sql_value_cast_enabled = True.
- Improved the error message for write_pandas when the target table does not exist and auto_create_table=False.
- Added open telemetry tracing on UDxF functions in Snowpark.
- Added open telemetry tracing on stored procedure registration in Snowpark.
- Added a new optional parameter called format_json to the Session.SessionBuilder.app_name function that sets the app name in the Session.query_tag in JSON format. By default, this parameter is set to False.
Bug Fixes
- Fixed a bug where SQL generated for lag(x, 0) was incorrect and failed with the error message: argument 1 to function LAG needs to be constant, found 'SYSTEM$NULL_TO_FIXED(null)'.
Snowpark Local Testing Updates
New Features
- Added support for the following APIs:
  - snowflake.snowpark.functions
    - random
- Added new parameters to the patch function when registering a mocked function (see the sketch below):
  - distinct allows an alternate function to be specified for when a SQL function should be distinct.
  - pass_column_index passes a named parameter column_index to the mocked function that contains the pandas.Index for the input data.
  - pass_row_index passes a named parameter row_index to the mocked function that is the 0-indexed row number the function is currently operating on.
  - pass_input_data passes a named parameter input_data to the mocked function that contains the entire input dataframe for the current expression.
- Added support for the column_order parameter to the method DataFrameWriter.save_as_table.
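A sketch of the new patch parameters (the mocked implementation shown is an assumption, not the library's own example):
from snowflake.snowpark.mock import patch
from snowflake.snowpark.functions import to_char

@patch(to_char, pass_row_index=True)
def mock_to_char(column, fmt=None, row_index=None):
    # row_index is the 0-indexed row number currently being processed
    ...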
Bug Fixes
- Fixed a bug that caused DecimalType columns to be incorrectly truncated to integer precision when used in BinaryExpressions.
Snowpark pandas API Updates
New Features
- Added support for DataFrameGroupBy.all, SeriesGroupBy.all, DataFrameGroupBy.any, and SeriesGroupBy.any.
- Added support for DataFrame.nlargest, DataFrame.nsmallest, Series.nlargest and Series.nsmallest.
- Added support for replace and frac > 1 in DataFrame.sample and Series.sample.
- Added support for read_excel (Uses local pandas for processing).
- Added support for Series.at, Series.iat, DataFrame.at, and DataFrame.iat.
- Added support for Series.dt.isocalendar.
- Added support for Series.case_when except when condition or replacement is callable.
- Added documentation pages for Index and its APIs.
- Added support for DataFrame.assign.
- Added support for DataFrame.stack.
- Added support for DataFrame.pivot and pd.pivot.
- Added support for DataFrame.to_csv and Series.to_csv.
- Added partial support for Series.str.translate where the values in the table are single-codepoint strings.
- Added support for DataFrame.corr.
- Allow df.plot() and series.plot() to be called, materializing the data into the local client.
- Added support for DataFrameGroupBy and SeriesGroupBy aggregations first and last.
- Added support for DataFrameGroupBy.get_group.
- Added support for the limit parameter when the method parameter is used in fillna.
- Added support for DataFrame.equals and Series.equals.
- Added support for DataFrame.reindex and Series.reindex.
- Added support for Index.astype.
- Added support for Index.unique and Index.nunique.
- Added support for Index.sort_values.
Bug Fixes
- Fixed an issue when using np.where and df.where when the scalar 'other' is the literal 0.
- Fixed a bug regarding precision loss when converting to Snowpark pandas DataFrame or Series with dtype=np.uint64.
- Fixed a bug where values is set to index when index and columns contain all columns in the DataFrame during pivot_table.
Improvements
- Added support for Index.copy().
- Added support for Index APIs: dtype, values, item(), tolist(), to_series() and to_frame().
- Expanded support for DataFrames with no rows in pd.pivot_table and DataFrame.pivot_table.
- Added support for the inplace parameter in DataFrame.sort_index and Series.sort_index.
1.19.0 (2024-06-25)
Snowpark Python API Updates
New Features
- Added support for the to_boolean function.
- Added documentation pages for Index and its APIs.
Bug Fixes
- Fixed a bug where a Python stored procedure with a table return type failed when run in a task.
- Fixed a bug where df.dropna failed due to RecursionError: maximum recursion depth exceeded when the DataFrame has more than 500 columns.
- Fixed a bug where AsyncJob.result("no_result") didn't wait for the query to finish execution.
Snowpark Local Testing Updates
New Features
- Added support for the strict parameter when registering UDFs and Stored Procedures.
Bug Fixes
- Fixed a bug in convert_timezone that made setting the source_timezone parameter return an error.
- Fixed a bug where creating a DataFrame with empty data of type DateType raised AttributeError.
- Fixed a bug where table merge failed when an update clause existed but no update took place.
- Fixed a bug in the mock implementation of to_char that raised IndexError when the incoming column had a nonconsecutive row index.
- Fixed a bug in the handling of CaseExpr expressions that raised IndexError when the incoming column had a nonconsecutive row index.
- Fixed a bug in the implementation of Column.like that raised IndexError when the incoming column had a nonconsecutive row index.
Improvements
- Added support for type coercion in the implementation of DataFrame.replace, DataFrame.dropna and the mock function iff.
Snowpark pandas API Updates
New Features
- Added partial support for DataFrame.pct_change and Series.pct_change without the freq and limit parameters.
- Added support for Series.str.get.
- Added support for Series.dt.dayofweek, Series.dt.day_of_week, Series.dt.dayofyear, and Series.dt.day_of_year.
- Added support for Series.str.__getitem__ (Series.str[...]).
- Added support for Series.str.lstrip and Series.str.rstrip.
- Added support for DataFrameGroupBy.size and SeriesGroupBy.size.
- Added support for DataFrame.expanding and Series.expanding for aggregations count, sum, min, max, mean, std, var, and sem with axis=0.
- Added support for DataFrame.rolling and Series.rolling for aggregation count with axis=0.
- Added support for Series.str.match.
- Added support for DataFrame.resample and Series.resample for aggregations size, first, and last.
- Added support for DataFrameGroupBy.all, SeriesGroupBy.all, DataFrameGroupBy.any, and SeriesGroupBy.any.
- Added support for DataFrame.nlargest, DataFrame.nsmallest, Series.nlargest and Series.nsmallest.
- Added support for replace and frac > 1 in DataFrame.sample and Series.sample.
- Added support for read_excel (Uses local pandas for processing).
- Added support for Series.at, Series.iat, DataFrame.at, and DataFrame.iat.
- Added support for Series.dt.isocalendar.
- Added support for Series.case_when except when condition or replacement is callable.
- Added documentation pages for Index and its APIs.
- Added support for DataFrame.assign.
- Added support for DataFrame.stack.
- Added support for DataFrame.pivot and pd.pivot.
- Added support for DataFrame.to_csv and Series.to_csv.
- Added support for Index.T.
Bug Fixes
- Fixed a bug that caused the output of GroupBy.aggregate's columns to be ordered incorrectly.
- Fixed a bug where DataFrame.describe on a frame with duplicate columns of differing dtypes could cause an error or incorrect results.
- Fixed a bug in DataFrame.rolling and Series.rolling so window=0 now throws NotImplementedError instead of ValueError.
Improvements
- Added support for named aggregations in DataFrame.aggregate and Series.aggregate with axis=0.
- pd.read_csv reads using the native pandas CSV parser, then uploads data to Snowflake using parquet. This enables most of the parameters supported by read_csv, including date parsing and numeric conversions. Uploading via parquet is roughly twice as fast as uploading via CSV.
- Initial work to support a pd.Index directly in Snowpark pandas. Support for pd.Index as a first-class component of Snowpark pandas is coming soon.
- Added a lazy index constructor and support for len, shape, size, empty, to_pandas() and names. For df.index, Snowpark pandas creates a lazy index object.
- For df.columns, Snowpark pandas supports a non-lazy version of an Index since the data is already stored locally.
1.18.0 (2024-05-28)
Snowpark Python API Updates
Improvements
- Improved the error message to remind users to set {"infer_schema": True} when reading a CSV file without specifying its schema.
- Improved error handling for Session.create_dataframe when called with more than 512 rows and using the format or pyformat paramstyle.
Snowpark pandas API Updates
New Features
- Added DataFrame.cache_result and Series.cache_result methods for users to persist DataFrames and Series to a temporary table lasting the duration of the session to improve latency of subsequent operations.
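A sketch of caching an expensive intermediate result (assumes the Snowpark pandas plugin is imported and a session is active; the table name is illustrative):
df = pd.read_snowflake("big_table").groupby("k").sum()
cached = df.cache_result(inplace=False)  # persisted to a temp table for the session
cached.head()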
Improvements
- Added partial support for DataFrame.pivot_table with no index parameter, as well as for the margins parameter.
- Updated the signature of DataFrame.shift/Series.shift/DataFrameGroupBy.shift/SeriesGroupBy.shift to match pandas 2.2.1. Snowpark pandas does not yet support the newly-added suffix argument, or sequence values of periods.
- Re-added support for Series.str.split.
Bug Fixes
- Fixed how we support mixed columns for string methods (Series.str.*).
Snowpark Local Testing Updates
New Features
- Added support for the following DataFrameReader read options for file formats csv and json:
  - PURGE
  - PATTERN
  - INFER_SCHEMA with value being False
  - ENCODING with value being UTF8
- Added support for DataFrame.analytics.moving_agg and DataFrame.analytics.cumulative_agg_agg.
- Added support for the if_not_exists parameter during UDF and stored procedure registration.
Bug Fixes
- Fixed a bug where, when processing time formats, the fractional second part was not handled properly.
- Fixed a bug that caused function calls on * to fail.
- Fixed a bug that prevented creation of map and struct type objects.
- Fixed a bug where the function date_add was unable to handle some numeric types.
- Fixed a bug where TimestampType casting resulted in incorrect data.
- Fixed a bug that caused DecimalType data to have incorrect precision in some cases.
- Fixed a bug where referencing a missing table or view raised a confusing IndexError.
- Fixed a bug where the mocked function to_timestamp_ntz could not handle None data.
- Fixed a bug where mocked UDFs handled output data of None improperly.
- Fixed a bug where DataFrame.with_column_renamed ignored attributes from parent DataFrames after join operations.
- Fixed a bug where the integer precision of large values was lost when converted to a pandas DataFrame.
- Fixed a bug where the schema of a datetime object was wrong when creating a DataFrame from a pandas DataFrame.
- Fixed a bug in the implementation of Column.equal_nan where null data was handled incorrectly.
- Fixed a bug where DataFrame.drop ignored attributes from parent DataFrames after join operations.
- Fixed a bug in the mocked function date_part where the Column type was set wrong.
- Fixed a bug where DataFrameWriter.save_as_table did not raise exceptions when inserting null data into non-nullable columns.
- Fixed bugs in the implementation of DataFrameWriter.save_as_table where:
  - Append or Truncate fails when incoming data has a different schema than the existing table.
  - Truncate fails when incoming data does not specify columns that are nullable.
Improvements
- Removed the dependency check for pyarrow as it is not used.
- Improved target type coverage of Column.cast, adding support for casting to boolean and all integral types.
- Aligned the error experience when calling UDFs and stored procedures.
- Added appropriate error messages for the is_permanent and anonymous options in UDF and stored procedure registration to make it more clear that those features are not yet supported.
- File read operations with unsupported options and values now raise NotImplementedError instead of warnings and unclear error information.
1.17.0 (2024-05-21)
Snowpark Python API Updates
New Features
- Added support to add a comment on tables and views using the functions listed below:
  - DataFrameWriter.save_as_table
  - DataFrame.create_or_replace_view
  - DataFrame.create_or_replace_temp_view
  - DataFrame.create_or_replace_dynamic_table
Improvements
- Improved the error message to remind users to set {"infer_schema": True} when reading a CSV file without specifying its schema.
Snowpark pandas API Updates
New Features
- Start of Public Preview of Snowpark pandas API. Refer to the Snowpark pandas API Docs for more details.
Snowpark Local Testing Updates
New Features
- Added support for NumericType and VariantType data conversion in the mocked functions to_timestamp_ltz, to_timestamp_ntz, to_timestamp_tz and to_timestamp.
- Added support for DecimalType, BinaryType, ArrayType, MapType, TimestampType, DateType and TimeType data conversion in the mocked function to_char.
- Added support for the following APIs:
  - snowflake.snowpark.functions:
    - to_varchar
  - snowflake.snowpark.DataFrame:
    - pivot
  - snowflake.snowpark.Session:
    - cancel_all
- Introduced a new exception class snowflake.snowpark.mock.exceptions.SnowparkLocalTestingException.
- Added support for casting to FloatType.
Bug Fixes
- Fixed a bug so that stored procedures and UDFs do not remove imports already in sys.path during the clean-up step.
- Fixed a bug where, when processing datetime formats, the fractional second part was not handled properly.
- Fixed a bug on the Windows platform where file operations were unable to properly handle file separators in directory names.
- Fixed a bug on the Windows platform where, when reading a pandas dataframe, an IntervalType column with integer data could not be processed.
- Fixed a bug that prevented users from being able to select multiple columns with the same alias.
- Fixed a bug where Session.get_current_[schema|database|role|user|account|warehouse] returned upper-cased identifiers when identifiers are quoted.
- Fixed a bug where the functions substr and substring could not handle a 0-based start_expr.
Improvements
- Standardized the error experience by raising SnowparkLocalTestingException in error cases, which is on par with SnowparkSQLException raised in non-local execution.
- Improved the error experience of the Session.write_pandas method so that NotImplementedError will be raised when it is called.
- Aligned error experience with reusing a closed session in non-local execution.
1.16.0 (2024-05-07)
New Features
- Added support for registering stored procedures with packages given as Python modules.
- Added snowflake.snowpark.Session.lineage.trace to explore data lineage of Snowflake objects.
- Added support for structured type schema parsing.
Bug Fixes
- Fixed a bug where, when inferring a schema, single quotes were added to stage files that already had single quotes.
Local Testing Updates
New Features
- Added support for StringType, TimestampType and VariantType data conversion in the mocked function to_date.
- Added support for the following APIs:
  - snowflake.snowpark.functions
    - get
    - concat
    - concat_ws
Bug Fixes
- Fixed a bug that caused NaT and NaN values to not be recognized.
- Fixed a bug where, when inferring a schema, single quotes were added to stage files that already had single quotes.
- Fixed a bug where DataFrameReader.csv was unable to handle quoted values containing a delimiter.
- Fixed a bug where, when there is a None value in an arithmetic calculation, the output should remain None instead of math.nan.
- Fixed a bug in the functions sum and covar_pop where, when there is math.nan in the data, the output should also be math.nan.
- Fixed a bug where stage operations could not handle directories.
- Fixed a bug where DataFrame.to_pandas did not take Snowflake numeric types with precision 38 as int64.
1.15.0 (2024-04-24)
New Features
- Added truncate save mode in DataFrameWriter to overwrite existing tables by truncating the underlying table instead of dropping it (see the sketch after this feature list).
- Added telemetry to calculate query plan height and number of duplicate nodes during collect operations.
- Added the functions below to unload data from a DataFrame into one or more files in a stage (see the sketch after this feature list):
  - DataFrame.write.json
  - DataFrame.write.csv
  - DataFrame.write.parquet
- Added distributed tracing using open telemetry APIs for action functions in DataFrame and DataFrameWriter:
  - snowflake.snowpark.DataFrame:
    - collect
    - collect_nowait
    - to_pandas
    - count
    - show
  - snowflake.snowpark.DataFrameWriter:
    - save_as_table
- Added support for snow:// URLs to snowflake.snowpark.Session.file.get and snowflake.snowpark.Session.file.get_stream.
- Added support to register stored procedures and UDxFs with a comment.
- UDAF client support is ready for public preview. Please stay tuned for the Snowflake announcement of UDAF public preview.
- Added support for dynamic pivot. This feature is currently in private preview.
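Two brief sketches from the list above (table and stage names are illustrative):
# overwrite by truncating instead of dropping the table
df.write.mode("truncate").save_as_table("my_table")

# unload a DataFrame to one or more staged files
df.write.csv("@mystage/unload/data_")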
Improvements
- Improved the generated query performance for both compilation and execution by converting duplicate subqueries to Common Table Expressions (CTEs). It is still an experimental feature not enabled by default, and can be enabled by setting session.cte_optimization_enabled to True.
Bug Fixes
- Fixed a bug where statement_params was not passed to query executions that register stored procedures and user defined functions.
- Fixed a bug causing snowflake.snowpark.Session.file.get_stream to fail for quoted stage locations.
- Fixed a bug where an internal type hint in utils.py might raise AttributeError in case the underlying module cannot be found.
Local Testing Updates
New Features
- Added support for registering UDFs and stored procedures.
- Added support for the following APIs:
  - snowflake.snowpark.Session:
    - file.put
    - file.put_stream
    - file.get
    - file.get_stream
    - read.json
    - add_import
    - remove_import
    - get_imports
    - clear_imports
    - add_packages
    - add_requirements
    - clear_packages
    - remove_package
    - udf.register
    - udf.register_from_file
    - sproc.register
    - sproc.register_from_file
  - snowflake.snowpark.functions
    - current_database
    - current_session
    - date_trunc
    - object_construct
    - object_construct_keep_null
    - pow
    - sqrt
    - udf
    - sproc
- Added support for StringType, TimestampType and VariantType data conversion in the mocked function to_time.
Bug Fixes
- Fixed a bug that null-filled columns for constant functions.
- Fixed the implementation of to_object, to_array and to_binary to better handle null inputs.
- Fixed a bug where timestamp data comparison could not handle years beyond 2262.
- Fixed a bug so that Session.builder.getOrCreate returns the created mock session.
1.14.0 (2024-03-20)
New Features
- Added support for creating vectorized UDTFs with the process method.
- Added support for dataframe functions:
  - to_timestamp_ltz
  - to_timestamp_ntz
  - to_timestamp_tz
  - locate
- Added support for ASOF JOIN type.
- Added support for the following local testing APIs:
  - snowflake.snowpark.functions:
    - to_double
    - to_timestamp
    - to_timestamp_ltz
    - to_timestamp_ntz
    - to_timestamp_tz
    - greatest
    - least
    - convert_timezone
    - dateadd
    - date_part
  - snowflake.snowpark.Session:
    - get_current_account
    - get_current_warehouse
    - get_current_role
    - use_schema
    - use_warehouse
    - use_database
    - use_role
Bug Fixes
- Fixed a bug in SnowflakePlanBuilder where save_as_table did not correctly filter columns whose names start with '$' and are followed by a number.
- Fixed a bug where statement parameters may have no effect when resolving imports and packages.
- Fixed bugs in local testing:
  - LEFT ANTI and LEFT SEMI joins drop rows with null values.
  - DataFrameReader.csv incorrectly parses data when the optional parameter field_optionally_enclosed_by is specified.
  - Column.regexp only considers the first entry when pattern is a Column.
  - Table.update raises KeyError when updating null values in the rows.
  - VARIANT columns raise errors at DataFrame.collect.
  - count_distinct does not work correctly when counting.
  - Null values in integer columns raise TypeError.
Improvements
- Added telemetry to local testing.
- Improved the error message of DataFrameReader to raise a FileNotFound error when reading a path that does not exist or when there are no files under the path.
1.13.0 (2024-02-26)
New Features
- Added support for an optional date_part argument in the function last_day.
- SessionBuilder.app_name will set the query_tag after the session is created.
- Added support for the following local testing functions:
  - current_timestamp
  - current_date
  - current_time
  - strip_null_value
  - upper
  - lower
  - length
  - initcap
Improvements
- Added cleanup logic at interpreter shutdown to close all active sessions.
- Closing sessions within stored procedures is now a no-op that logs a warning instead of raising an error.
Bug Fixes
- Fixed a bug in DataFrame.to_local_iterator where the iterator could yield wrong results if another query is executed before the iterator finishes due to the wrong isolation level. For details, please see #945.
- Fixed a bug that truncated table names in error messages while running a plan with local testing enabled.
- Fixed a bug where Session.range returned an empty result when the range is large.
1.12.1 (2024-02-08)
Improvements
- Use split_blocks=True by default during to_pandas conversion, for optimal memory allocation. This parameter is passed to pyarrow.Table.to_pandas, which enables PyArrow to split the memory allocation into smaller, more manageable blocks instead of allocating a single contiguous block. This results in better memory management when dealing with larger datasets.
Bug Fixes
- Fixed a bug in DataFrame.to_pandas that caused an error when evaluating on a DataFrame with an IntegerType column with null values.
1.12.0 (2024-01-30)
New Features
- Exposed statement_params in StoredProcedure.__call__.
- Added two optional arguments to Session.add_import (see the sketch after this list):
  - chunk_size: The number of bytes to hash per chunk of the uploaded files.
  - whole_file_hash: By default, only the first chunk of the uploaded import is hashed to save time. When this is set to True, each uploaded file is fully hashed instead.
- Added parameters external_access_integrations and secrets when creating a UDAF from Snowpark Python to allow integration with external access.
- Added a new method Session.append_query_tag. Allows an additional tag to be added to the current query tag by appending it as a comma-separated value.
- Added a new method Session.update_query_tag. Allows updates to a JSON-encoded dictionary query tag.
- SessionBuilder.getOrCreate will now attempt to replace the singleton it returns when token expiration has been detected.
- Added support for new functions in snowflake.snowpark.functions:
- array_except
- create_map
- sign/signum
 
- Added the following functions to DataFrame.analytics (see the sketch after this list):
- Added the moving_agg function in DataFrame.analytics to enable moving aggregations like sums and averages with multiple window sizes.
- Added the cumulative_agg function in DataFrame.analytics to enable cumulative aggregations like sums and averages on multiple columns.
- Added the compute_lag and compute_lead functions in DataFrame.analytics for enabling lead and lag calculations on multiple columns.
- Added the time_series_agg function in DataFrame.analytics to enable time series aggregations like sums and averages with multiple time windows.
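A minimal sketch of DataFrame.analytics.moving_agg, assuming an existing session; the data and column names are illustrative:
df = session.create_dataframe(
    [["2024-01-01", 101, 200], ["2024-01-02", 101, 100], ["2024-01-03", 101, 300]],
    schema=["ORDERDATE", "PRODUCTKEY", "SALESAMOUNT"],
)
# Moving SUM and AVG of SALESAMOUNT over window sizes 2 and 3,
# per product, ordered by date.
result = df.analytics.moving_agg(
    aggs={"SALESAMOUNT": ["SUM", "AVG"]},
    window_sizes=[2, 3],
    order_by=["ORDERDATE"],
    group_by=["PRODUCTKEY"],
)
result.show()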
Bug Fixes
- Fixed a bug in DataFrame.na.fill that caused Boolean values to erroneously override integer values.
- Fixed a bug in Session.create_dataframe where Snowpark DataFrames created using pandas DataFrames were not inferring the type for timestamp columns correctly. The behavior is as follows:
- Earlier, timestamp columns without a timezone would be converted to nanosecond epochs and inferred as LongType(); they are now correctly maintained as timestamp values and inferred as TimestampType(TimestampTimeZone.NTZ).
- Earlier, timestamp columns with a timezone would be inferred as TimestampType(TimestampTimeZone.NTZ) and lose timezone information; they are now correctly inferred as TimestampType(TimestampTimeZone.LTZ), and timezone information is retained correctly.
- Set the session parameter PYTHON_SNOWPARK_USE_LOGICAL_TYPE_FOR_CREATE_DATAFRAME to revert to the old behavior. It is recommended that you update your code to align with the correct behavior because the parameter will be removed in the future.
- Fixed a bug where DataFrame.to_pandas created an object dtype in pandas for decimal types with a non-zero scale. Instead, the value is now cast to a float64 type.
- Fixed bugs that wrongly flattened the generated SQL when one of the following happens:
- DataFrame.filter() is called after DataFrame.sort().limit().
- DataFrame.sort() or filter() is called on a DataFrame that already has a window function or sequence-dependent data generator column. For instance, df.select("a", seq1().alias("b")).select("a", "b").sort("a") won't flatten the sort clause anymore.
- A window or sequence-dependent data generator column is used after DataFrame.limit(). For instance, df.limit(10).select(row_number().over()) won't flatten the limit and select in the generated SQL.
- Fixed a bug where aliasing a DataFrame column raised an error when the DataFrame was copied from another DataFrame with an aliased column. For instance, df = df.select(col("a").alias("b")); df = copy(df); df.select(col("b").alias("c")) used to throw an error. Now it's fixed.
- Fixed a bug in Session.create_dataframe where the non-nullable field in a schema was not respected for the boolean type. Note that this fix is only effective when the user has the privilege to create a temp table.
- Fixed a bug in the SQL simplifier where non-select statements in session.sql dropped a SQL query when used with limit().
- Fixed a bug that raised an exception when the session parameter ERROR_ON_NONDETERMINISTIC_UPDATE is true.
Behavior Changes (API Compatible)
- When parsing data types during a to_pandas operation, we rely on the GS precision value to fix precision issues for large integer values. This may affect users where a column that was earlier returned as int8 gets returned as int64. Users can fix this by explicitly specifying precision values for their return column.
- Aligned the behavior of Session.call in the case of table stored procedures, where running Session.call would not trigger the stored procedure unless a collect() operation was performed.
- StoredProcedureRegistration will now automatically add snowflake-snowpark-python as a package dependency. The added dependency will be on the client's local version of the library, and an error is thrown if the server cannot support that version.
1.11.1 (2023-12-07)
Bug Fixes
- Fixed a bug where numpy was imported at the top level of the mock module.
- Added support for these new functions in snowflake.snowpark.functions:
- from_utc_timestamp
- to_utc_timestamp
 
1.11.0 (2023-12-05)
New Features
- Added the conn_error attribute to SnowflakeSQLException that stores the whole underlying exception from snowflake-connector-python.
- Added support for RelationalGroupedDataframe.pivot() to access pivot in the following pattern: Dataframe.group_by(...).pivot(...).
- Added experimental feature: Local Testing Mode, which allows you to create and operate on Snowpark Python DataFrames locally without connecting to a Snowflake account. You can use the local testing framework to test your DataFrame operations locally, on your development machine or in a CI (continuous integration) pipeline, before deploying code changes to your account (see the sketch after this list).
- Added support for the new function arrays_to_object in snowflake.snowpark.functions.
- Added support for the vector data type.
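A minimal sketch of Local Testing Mode; no Snowflake account is needed, only the installed library:
from snowflake.snowpark import Session

# Create a local testing session instead of connecting to an account.
session = Session.builder.config("local_testing", True).create()
df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
print(df.filter(df.a > 1).collect())  # runs entirely on the local machine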
Dependency Updates
- Bumped cloudpickle dependency to work with cloudpickle==2.2.1
- Updated snowflake-connector-python to 3.4.0.
Bug Fixes
- DataFrame column names quoting check now supports newline characters.
- Fixed a bug where a DataFrame generated by session.read.with_metadata created an inconsistent table when calling df.write.save_as_table.
1.10.0 (2023-11-03)
New Features
- Added support for managing case sensitivity in DataFrame.to_local_iterator().
- Added support for specifying a vectorized UDTF's input column names by using the optional parameter input_names in UDTFRegistration.register/register_file and functions.pandas_udtf (see the sketch below). By default, RelationalGroupedDataFrame.applyInPandas will infer the column names from the current dataframe schema.
- Added sql_error_code and raw_message attributes to SnowflakeSQLException when it is caused by a SQL exception.
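A minimal sketch of a vectorized UDTF registered with input_names, assuming an active session; the handler class, column names and quoting are illustrative:
import pandas as pd
from snowflake.snowpark.functions import pandas_udtf
from snowflake.snowpark.types import PandasDataFrameType, IntegerType, StringType

class Doubler:
    def end_partition(self, df: pd.DataFrame):
        # The whole partition arrives as one pandas DataFrame.
        df["VAL"] = df["VAL"] * 2
        yield df

doubler_udtf = pandas_udtf(
    Doubler,
    output_schema=PandasDataFrameType([StringType(), IntegerType()], ["ID", "VAL"]),
    input_types=[PandasDataFrameType([StringType(), IntegerType()])],
    input_names=['"ID"', '"VAL"'],
)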
Bug Fixes
- Fixed a bug in DataFrame.to_pandas() where converting Snowpark DataFrames to pandas DataFrames was losing precision on integers with more than 19 digits.
- Fixed a bug where session.add_packages could not handle a requirement specifier that contains a project name with an underscore and a version.
- Fixed a bug in DataFrame.limit() when offset is used and the parent DataFrame uses limit. Now the offset won't impact the parent DataFrame's limit.
- Fixed a bug in DataFrame.write.save_as_table where DataFrames created from the read API could not save data into Snowflake because of an invalid column name $1.
Behavior Changes
- Changed the behavior of date_format:
- The format argument changed from optional to required.
- The returned result changed from a date object to a date-formatted string.
- When a window function or a sequence-dependent data generator function (normal, zipf, uniform, seq1, seq2, seq4, seq8) is used, the sort and filter operations will no longer be flattened when generating the query.
1.9.0 (2023-10-13)
New Features
- Added support for the Python 3.11 runtime environment.
Dependency updates
- Added back the dependency of typing-extensions.
Bug Fixes
- Fixed a bug where imports from permanent stage locations were ignored for temporary stored procedures, UDTFs, UDFs, and UDAFs.
- Reverted to using a CTAS (create table as select) statement for Dataframe.writer.save_as_table, which does not need insert permission for writing tables.
New Features
- Support PythonObjJSONEncoder json-serializable objects for ARRAY and OBJECT literals.
1.8.0 (2023-09-14)
New Features
- Added support for the VOLATILE/IMMUTABLE keyword when registering UDFs.
- Added support for specifying clustering keys when saving dataframes using DataFrame.save_as_table.
- Accept Iterable objects as input for schema when creating dataframes using Session.create_dataframe.
- Added the property DataFrame.session to return a Session object.
- Added the property Session.session_id to return an integer that represents the session ID.
- Added the property Session.connection to return a SnowflakeConnection object.
- Added support for creating a Snowpark session from a configuration file or environment variables (see the sketch below).
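A minimal sketch, assuming connection settings live in ~/.snowflake/connections.toml; the connection name my_dev is hypothetical:
from snowflake.snowpark import Session

# Pick a named connection from the configuration file.
session = Session.builder.config("connection_name", "my_dev").create()

# With a default connection configured, no explicit parameters are needed.
session = Session.builder.create()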
Dependency updates
- Updated snowflake-connector-python to 3.2.0.
Bug Fixes
- Fixed a bug where automatic package upload would raise ValueError even when a compatible package version was added in session.add_packages.
- Fixed a bug where table stored procedures were not registered correctly when using register_from_file.
- Fixed a bug where dataframe joins failed with an invalid_identifier error.
- Fixed a bug where DataFrame.copy disabled the SQL simplifier for the returned copy.
- Fixed a bug where session.sql().select() would fail if any parameters are specified to session.sql().
1.7.0 (2023-08-28)
New Features
- Added parameters external_access_integrations and secrets when creating a UDF, UDTF or Stored Procedure from Snowpark Python to allow integration with external access.
- Added support for these new functions in snowflake.snowpark.functions:
- array_flatten
- flatten
- Added support for apply_in_pandas in snowflake.snowpark.relational_grouped_dataframe (see the sketch after this list).
- Added support for replicating your local Python environment on Snowflake via Session.replicate_local_environment.
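A minimal sketch of apply_in_pandas, assuming an existing session; the grouping column and function are illustrative:
import pandas as pd
from snowflake.snowpark.types import StructType, StructField, StringType, FloatType

df = session.create_dataframe(
    [["A", 1.0], ["A", 2.0], ["B", 3.0]], schema=["GRADE", "VALUE"]
)

def demean(pdf: pd.DataFrame) -> pd.DataFrame:
    # Center VALUE within each group.
    pdf["VALUE"] = pdf["VALUE"] - pdf["VALUE"].mean()
    return pdf

output_schema = StructType(
    [StructField("GRADE", StringType()), StructField("VALUE", FloatType())]
)
df.group_by("GRADE").apply_in_pandas(demean, output_schema=output_schema).show()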
Bug Fixes
- Fixed a bug where session.create_dataframe failed to properly set nullable columns when nullability was affected by the order of the data given.
- Fixed a bug where DataFrame.select could not identify and alias columns in the presence of table functions when the output columns of the table function overlapped with columns in the dataframe.
Behavior Changes
- Creating stored procedures, UDFs, UDTFs and UDAFs with the parameter is_permanent=False will now create temporary objects even when stage_name is provided. The default value of is_permanent is False, so users who do not explicitly set it to True for permanent objects will notice a change in behavior.
- types.StructField now enquotes column identifiers by default.
1.6.1 (2023-08-02)
New Features
- Added support for these new functions in snowflake.snowpark.functions:
- array_sort
- sort_array
- array_min
- array_max
- explode_outer
- Added support for pure Python packages specified via Session.add_requirements or Session.add_packages. They are now usable in stored procedures and UDFs even if packages are not present on the Snowflake Anaconda channel.
- Added the Session parameters custom_packages_upload_enabled and custom_packages_force_upload_enabled to enable the support for the pure Python packages feature mentioned above. Both parameters default to False.
- Added support for specifying package requirements by passing a Conda environment yaml file to Session.add_requirements.
- Added support for asynchronous execution of multi-query dataframes that contain binding variables.
- Added support for renaming multiple columns in DataFrame.rename.
- Added support for Geometry datatypes.
- Added support for params in session.sql() in stored procedures.
- Added support for user-defined aggregate functions (UDAFs). This feature is currently in private preview (see the sketch after this list).
- Added support for vectorized UDTFs (user-defined table functions). This feature is currently in public preview.
- Added support for Snowflake Timestamp variants (i.e., TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ):
- Added TimestampTimezone as an argument in the TimestampType constructor.
- Added type hints NTZ, LTZ, TZ and Timestamp to annotate functions when registering UDFs.
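A minimal sketch of a UDAF handler based on the documented interface (aggregate_state, accumulate, merge, finish), assuming an active session so the decorator can register it:
from snowflake.snowpark.functions import udaf
from snowflake.snowpark.types import IntegerType

@udaf(return_type=IntegerType(), input_types=[IntegerType()])
class IntSum:
    def __init__(self) -> None:
        self._sum = 0

    @property
    def aggregate_state(self):
        # Intermediate state exchanged between parallel instances.
        return self._sum

    def accumulate(self, value):
        self._sum += value

    def merge(self, other_state):
        self._sum += other_state

    def finish(self):
        return self._sum

# df.agg(IntSum("a")) would aggregate column "a" with this handler.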
Improvements
- Removed redundant dependency typing-extensions.
- DataFrame.cache_result now creates temp tables with fully qualified names under the current database and current schema.
Bug Fixes
- Fixed a bug where a type check happened on pandas before it was imported.
- Fixed a bug when creating a UDF from numpy.ufunc.
- Fixed a bug where DataFrame.union was not generating the correct Selectable.schema_query when the SQL simplifier is enabled.
Behavior Changes
- DataFrameWriter.save_as_table now respects the nullable field of the schema provided by the user, or the inferred schema based on data from user input.
Dependency updates
- Updated snowflake-connector-python to 3.0.4.
1.5.1 (2023-06-20)
New Features
- Added support for the Python 3.10 runtime environment.
1.5.0 (2023-06-09)
Behavior Changes
- Aggregation results from functions such as DataFrame.agg and DataFrame.describe no longer strip away non-printing characters from column names.
New Features
- Added support for the Python 3.9 runtime environment.
- Added support for new functions in snowflake.snowpark.functions:
- array_generate_range
- array_unique_agg
- collect_set
- sequence
- Added support for registering and calling stored procedures with TABLE return type.
- Added support for the parameter length in StringType() to specify the maximum number of characters that can be stored by the column.
- Added the alias functions.element_at() for functions.get().
- Added the alias Column.contains for functions.contains.
- Added the experimental feature DataFrame.alias.
- Added support for querying metadata columns from a stage when creating a DataFrame using DataFrameReader.
- Added support for StructType.add to append more fields to existing StructType objects.
- Added support for the parameter execute_as in StoredProcedureRegistration.register_from_file() to specify stored procedure caller rights.
Bug Fixes
- Fixed a bug where Dataframe.join_table_function did not run all of the necessary queries to set up the join table function when the SQL simplifier was enabled.
- Fixed type hint declarations for the custom types ColumnOrName, ColumnOrLiteralStr, ColumnOrSqlExpr, LiteralType and ColumnOrLiteral that were breaking mypy checks.
- Fixed a bug where DataFrameWriter.save_as_table and DataFrame.copy_into_table failed to parse fully qualified table names.
1.4.0 (2023-04-24)
New Features
- Added support for session.getOrCreate.
- Added support for alias Column.getField.
- Added support for new functions in snowflake.snowpark.functions:
- date_add and date_sub to make add and subtract operations easier.
- daydiff
- explode
- array_distinct
- regexp_extract
- struct
- format_number
- bround
- substring_index
 
- Added the parameter skip_upload_on_content_match when creating UDFs, UDTFs and stored procedures using register_from_file to skip uploading files to a stage if the same version of the files are already on the stage.
- Added support for the DataFrameWriter.save_as_table method to take table names that contain dots.
- Flattened generated SQL when DataFrame.filter() or DataFrame.order_by() is followed by a projection statement (e.g. DataFrame.select(), DataFrame.with_column()).
- Added support for creating dynamic tables (in private preview) using Dataframe.create_or_replace_dynamic_table.
- Added an optional argument params in session.sql() to support binding variables (see the sketch below). Note that this is not supported in stored procedures yet.
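A minimal sketch of binding variables through params, assuming an existing session; the table name MY_TABLE is hypothetical:
df = session.sql("SELECT * FROM MY_TABLE WHERE A > ? AND B = ?", params=[1, "x"])
df.collect()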
Bug Fixes
- Fixed a bug in strtok_to_array where an exception was thrown when a delimiter was passed in.
- Fixed a bug in session.add_import where the module had the same namespace as other dependencies.
1.3.0 (2023-03-28)
New Features
- Added support for the delimiters parameter in functions.initcap().
- Added support for functions.hash() to accept a variable number of input expressions.
- Added the API Session.RuntimeConfig, exposed via Session.conf, for getting, setting, or checking the mutability of any runtime configuration.
- Added support for managing case sensitivity in Row results from DataFrame.collect using the case_sensitive parameter.
- Added indexer support for snowflake.snowpark.types.StructType.
- Added a keyword argument log_on_exception to Dataframe.collect and Dataframe.collect_nowait to optionally disable error logging for SQL exceptions.
Bug Fixes
- Fixed a bug where a DataFrame set operation (DataFrame.subtract, DataFrame.union, etc.) being called after another DataFrame set operation and DataFrame.select or DataFrame.with_column threw an exception.
- Fixed a bug where chained sort statements were overwritten by the SQL simplifier.
Improvements
- Simplified JOIN queries to use constant subquery aliases (SNOWPARK_LEFT, SNOWPARK_RIGHT) by default. Users can disable this at runtime with session.conf.set('use_constant_subquery_alias', False) to use randomly generated alias names instead.
- Allowed specifying statement parameters in session.call().
- Enabled the uploading of large pandas DataFrames in stored procedures by defaulting to a chunk size of 100,000 rows.
1.2.0 (2023-03-02)
New Features
- Added support for displaying source code as comments in the generated scripts when registering stored procedures. This is enabled by default; turn it off by specifying source_code_display=False at registration.
- Added a parameter if_not_exists when creating a UDF, UDTF or Stored Procedure from Snowpark Python to ignore creating the specified function or procedure if it already exists.
- Accept integers when calling snowflake.snowpark.functions.get to extract a value from an array.
- Added functions.reverse to open access to the Snowflake built-in function reverse.
- Added the parameter require_scoped_url in snowflake.snowpark.files.SnowflakeFile.open() (in Private Preview) to replace is_owner_file, which is marked for deprecation.
Bug Fixes
- Fixed a bug that overwrote paramstyle to qmark when creating a Snowpark session.
- Fixed a bug where df.join(..., how="cross") failed with SnowparkJoinException: (1112): Unsupported using join type 'Cross'.
- Fixed a bug where querying a DataFrame column created from chained function calls used a wrong column name.
1.1.0 (2023-01-26)
New Features:
- Added asc, asc_nulls_first, asc_nulls_last, desc, desc_nulls_first, desc_nulls_last, date_part and unix_timestamp in functions.
- Added the property DataFrame.dtypes to return a list of column name and data type pairs.
- Added the following aliases:
- functions.expr() for functions.sql_expr().
- functions.date_format() for functions.to_date().
- functions.monotonically_increasing_id() for functions.seq8().
- functions.from_unixtime() for functions.to_timestamp().
 
Bug Fixes:
- Fixed a bug in SQL simplifier that didn’t handle Column alias and join well in some cases. See https://github.com/snowflakedb/snowpark-python/issues/658 for details.
- Fixed a bug in SQL simplifier that generated wrong column names for function calls, NaN and INF.
Improvements
- The session parameter PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER is True after Snowflake 7.3 was released. In snowpark-python, session.sql_simplifier_enabled reads the value of PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER by default, meaning that the SQL simplifier is enabled by default after the Snowflake 7.3 release. To turn this off, set PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER in Snowflake to False or run session.sql_simplifier_enabled = False from Snowpark. It is recommended to use the SQL simplifier because it helps to generate more concise SQL.
1.0.0 (2022-11-01)
New Features
- Added Session.generator() to create a new DataFrame using the Generator table function.
- Added a parameter secure to the functions that create a secure UDF or UDTF.
0.12.0 (2022-10-14)
New Features
- Added new APIs for async job:
- Session.create_async_job() to create an AsyncJob instance from a query ID.
- AsyncJob.result() now accepts the argument result_type to return the results in different formats.
- AsyncJob.to_df() returns a DataFrame built from the result of this asynchronous job.
- AsyncJob.query() returns the SQL text of the executed query.
- DataFrame.agg() and RelationalGroupedDataFrame.agg() now accept variable-length arguments.
- Added the parameters lsuffix and rsuffix to DataFrame.join() and DataFrame.cross_join() to conveniently rename overlapping columns.
- Added Table.drop_table() so you can drop the temp table after DataFrame.cache_result(). Table is also a context manager, so you can use the with statement to drop the cache temp table after use (see the sketch after this list).
- Added Session.use_secondary_roles().
- Added the functions first_value() and last_value(). (contributed by @chasleslr)
- Added on as an alias for using_columns and how as an alias for join_type in DataFrame.join().
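A minimal sketch of the Table context manager added above, assuming an existing session:
df = session.create_dataframe([[1], [2]], schema=["a"])
with df.cache_result() as cached:  # cache_result returns a Table
    cached.filter(cached.a > 1).collect()
# The temp table backing the cache is dropped when the block exits.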
Bug Fixes
- Fixed a bug in Session.create_dataframe() that raised an error when schema names had special characters.
- Fixed a bug in which options set in Session.read.option() were not passed to DataFrame.copy_into_table() as default values.
- Fixed a bug in which DataFrame.copy_into_table() raised an error when a copy option had single quotes in the value.
0.11.0 (2022-09-28)
Behavior Changes
- Session.add_packages() now raises ValueError when the version of a package cannot be found in the Snowflake Anaconda channel. Previously, Session.add_packages() succeeded, and a SnowparkSQLException exception was raised later in the UDF/SP registration step.
New Features:
- Added the method FileOperation.get_stream() to support downloading stage files as a stream.
- Added support in functions.ntiles() to accept an int argument.
- Added the following aliases:
- functions.call_function() for functions.call_builtin().
- functions.function() for functions.builtin().
- DataFrame.order_by() for DataFrame.sort().
- DataFrame.orderBy() for DataFrame.sort().
- Improved DataFrame.cache_result() to return a more accurate Table class instead of a DataFrame class.
- Added support to allow session as the first argument when calling StoredProcedure.
Improvements
- Improved nested query generation by flattening queries when applicable.
- This improvement can be enabled by setting Session.sql_simplifier_enabled = True.
- DataFrame.select(), DataFrame.with_column(), DataFrame.drop() and other select-related APIs have more flattened SQLs.
- DataFrame.union(), DataFrame.union_all(), DataFrame.except_(), DataFrame.intersect(), DataFrame.union_by_name() have flattened SQLs generated when multiple set operators are chained.
- Improved type annotations for async job APIs.
Bug Fixes
- Fixed a bug in which Table.update(), Table.delete(), Table.merge() tried to reference a temp table that does not exist.
0.10.0 (2022-09-16)
New Features:
- Added experimental APIs for evaluating Snowpark dataframes with asynchronous queries (see the sketch after this list):
- Added the keyword argument block to the following action APIs on Snowpark dataframes (which execute queries) to allow asynchronous evaluations:
- DataFrame.collect(), DataFrame.to_local_iterator(), DataFrame.to_pandas(), DataFrame.to_pandas_batches(), DataFrame.count(), DataFrame.first().
- DataFrameWriter.save_as_table(), DataFrameWriter.copy_into_location().
- Table.delete(), Table.update(), Table.merge().
- Added the method DataFrame.collect_nowait() to allow asynchronous evaluations.
- Added the class AsyncJob to retrieve results from asynchronously executed queries and check their status.
- Added support for table_type in Session.write_pandas(). You can now choose from these table_type options: "temporary", "temp", and "transient".
- Added support for using Python structured data (list, tuple and dict) as literal values in Snowpark.
- Added the keyword argument execute_as to functions.sproc() and session.sproc.register() to allow registering a stored procedure as a caller or owner.
- Added support for specifying a pre-configured file format when reading files from a stage in Snowflake.
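A minimal sketch of the asynchronous evaluation APIs, assuming an existing session:
df = session.create_dataframe([[1], [2], [3]], schema=["a"])

async_job = df.collect_nowait()  # returns an AsyncJob immediately
# ... do other work while the query runs ...
rows = async_job.result()        # blocks until the query finishes

count_job = df.count(block=False)  # block=False also returns an AsyncJob
total = count_job.result()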
Improvements:
- Added support for displaying details of a Snowpark session.
Bug Fixes:
- Fixed a bug in which DataFrame.copy_into_table() and DataFrameWriter.save_as_table() mistakenly created a new table if the table name was fully qualified and the table already existed.
Deprecations:
- Deprecated the keyword argument create_temp_table in Session.write_pandas().
- Deprecated invoking UDFs using arguments wrapped in a Python list or tuple. You can use variable-length arguments without a list or tuple.
Dependency updates
- Updated snowflake-connector-python to 2.7.12.
0.9.0 (2022-08-30)
New Features:
- Added support for displaying source code as comments in the generated scripts when registering UDFs. This feature is turned on by default. To turn it off, pass the new keyword argument source_code_display as False when calling register() or @udf().
- Added support for calling table functions from DataFrame.select(), DataFrame.with_column() and DataFrame.with_columns(), which now take parameters of type table_function.TableFunctionCall for columns.
- Added the keyword argument overwrite to session.write_pandas() to allow overwriting the contents of a Snowflake table with those of a pandas DataFrame.
- Added the keyword argument column_order to df.write.save_as_table() to specify the matching rules when inserting data into a table in append mode.
- Added the method FileOperation.put_stream() to upload local files to a stage via a file stream.
- Added the methods TableFunctionCall.alias() and TableFunctionCall.as_() to allow aliasing the names of columns that come from the output of table function joins.
- Added the function get_active_session() in the module snowflake.snowpark.context to get the current active Snowpark session.
Bug Fixes:
- Fixed a bug in which batch insert raised an error when statement_params was not passed to the function.
- Fixed a bug in which column names were not quoted when session.create_dataframe() was called with dicts and a given schema.
- Fixed a bug in which the creation of a table was not skipped when the table already exists and is in append mode when calling df.write.save_as_table().
- Fixed a bug in which third-party packages with underscores could not be added when registering UDFs.
Improvements:
- Improved the function function.uniform() to infer the types of the inputs max_ and min_ and cast the limits to IntegerType or FloatType correspondingly.
0.8.0 (2022-07-22)
New Features:
- Added the keyword-only argument statement_params to the following methods to allow for specifying statement-level parameters:
- collect, to_local_iterator, to_pandas, to_pandas_batches, count, copy_into_table, show, create_or_replace_view, create_or_replace_temp_view, first, cache_result and random_split on class snowflake.snowpark.Dateframe.
- update, delete and merge on class snowflake.snowpark.Table.
- save_as_table and copy_into_location on class snowflake.snowpark.DataFrameWriter.
- approx_quantile, statement_params, cov and crosstab on class snowflake.snowpark.DataFrameStatFunctions.
- register and register_from_file on class snowflake.snowpark.udf.UDFRegistration.
- register and register_from_file on class snowflake.snowpark.udtf.UDTFRegistration.
- register and register_from_file on class snowflake.snowpark.stored_procedure.StoredProcedureRegistration.
- udf, udtf and sproc in snowflake.snowpark.functions.
 
- Added support for Column as an input argument to session.call().
- Added support for table_type in df.write.save_as_table(). You can now choose from these table_type options: "temporary", "temp", and "transient".
Improvements:
- Added validation of the object name in session.use_* methods.
- Updated the query tag in SQL to escape it when it has special characters.
- Added a check to see if Anaconda terms are acknowledged when adding missing packages.
Bug Fixes:
- Fixed the limited length of the string column in session.create_dataframe().
- Fixed a bug in which session.create_dataframe() mistakenly converted 0 and False to None when the input data was only a list.
- Fixed a bug in which calling session.create_dataframe() using a large local dataset sometimes created a temp table twice.
- Aligned the definition of function.trim() with the SQL function definition.
- Fixed an issue where snowpark-python would hang when using the Python system-defined (built-in) sum vs. the Snowpark function.sum().
Deprecations:
- Deprecated the keyword argument create_temp_table in df.write.save_as_table().
0.7.0 (2022-05-25)
New Features:
- Added support for user-defined table functions (UDTFs) (see the sketch after this list).
- Use the function snowflake.snowpark.functions.udtf() to register a UDTF, or use it as a decorator to register the UDTF.
- You can also use Session.udtf.register() to register a UDTF.
- Use Session.udtf.register_from_file() to register a UDTF from a Python file.
- Updated APIs to query a table function, including both Snowflake built-in table functions and UDTFs.
- Use the function snowflake.snowpark.functions.table_function() to create a callable representing a table function and use it to call the table function in a query.
- Alternatively, use the function snowflake.snowpark.functions.call_table_function() to call a table function.
- Added support for an over clause that specifies partition by and order by when laterally joining a table function.
- Updated Session.table_function() and DataFrame.join_table_function() to accept TableFunctionCall instances.
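A minimal sketch of registering and querying a UDTF with these APIs, assuming an active session; the handler below is illustrative:
from snowflake.snowpark.functions import lit, udtf
from snowflake.snowpark.types import IntegerType, StructType, StructField

@udtf(output_schema=StructType([StructField("number", IntegerType())]),
      input_types=[IntegerType()])
class GenerateNumbers:
    def process(self, n: int):
        # Emit one row per yielded tuple.
        for i in range(n):
            yield (i,)

session.table_function(GenerateNumbers(lit(3))).show()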
Breaking Changes:
- When creating a function with functions.udf() and functions.sproc(), you can now specify an empty list for the imports or packages argument to indicate that no import or package is used for this UDF or stored procedure. Previously, specifying an empty list meant that the function would use session-level imports or packages.
- Improved the __repr__ implementation of data types in types.py. The unused type_name property has been removed.
- Added a Snowpark-specific exception class for SQL errors. This replaces the previous ProgrammingError from the Python connector.
Improvements:
- Added a lock to a UDF or UDTF when it is called for the first time per thread.
- Improved the error message for pickling errors that occurred during UDF creation.
- Included the query ID when logging the failed query.
Bug Fixes:
- Fixed a bug in which non-integral data (such as timestamps) was occasionally converted to integer when calling DataFrame.to_pandas().
- Fixed a bug in which DataFrameReader.parquet() failed to read a parquet file when its column contained spaces.
- Fixed a bug in which DataFrame.copy_into_table() failed when the dataframe was created by reading a file with inferred schemas.
Deprecations
- Deprecated Session.flatten() and DataFrame.flatten().
Dependency Updates:
- Restricted the version of cloudpickle to <=2.0.0.
0.6.0 (2022-04-27)
New Features:
- Added support for vectorized UDFs, where the input is a pandas DataFrame or pandas Series and the output is a pandas Series (see the sketch after this list). This improves the performance of UDFs in Snowpark.
- Added support for inferring the schema of a DataFrame by default when it is created by reading a Parquet, Avro, or ORC file in the stage.
- Added the functions current_session(), current_statement(), current_user(), current_version(), current_warehouse(), date_from_parts(), date_trunc(), dayname(), dayofmonth(), dayofweek(), dayofyear(), grouping(), grouping_id(), hour(), last_day(), minute(), next_day(), previous_day(), second(), month(), monthname(), quarter(), year(), current_database(), current_role(), current_schema(), current_schemas(), current_region(), current_available_roles(), add_months(), any_value(), bitnot(), bitshiftleft(), bitshiftright(), convert_timezone(), uniform(), strtok_to_array(), sysdate(), time_from_parts(), timestamp_from_parts(), timestamp_ltz_from_parts(), timestamp_ntz_from_parts(), timestamp_tz_from_parts(), weekofyear(), percentile_cont() to snowflake.snowpark.functions.
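A minimal sketch of a vectorized UDF, shown with the current pandas_udf helper and assuming an active session:
import pandas as pd
from snowflake.snowpark.functions import pandas_udf
from snowflake.snowpark.types import IntegerType, PandasDataFrameType, PandasSeriesType

@pandas_udf(return_type=PandasSeriesType(IntegerType()),
            input_types=[PandasDataFrameType([IntegerType(), IntegerType()])])
def add_columns(df: pd.DataFrame) -> pd.Series:
    # The whole batch arrives as a pandas DataFrame; return one Series.
    return df[0] + df[1]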
Breaking Changes:
- Expired deprecations:
- Removed the following APIs that were deprecated in 0.4.0: DataFrame.groupByGroupingSets(), DataFrame.naturalJoin(), DataFrame.joinTableFunction, DataFrame.withColumns(), Session.getImports(), Session.addImport(), Session.removeImport(), Session.clearImports(), Session.getSessionStage(), Session.getDefaultDatabase(), Session.getDefaultSchema(), Session.getCurrentDatabase(), Session.getCurrentSchema(), Session.getFullyQualifiedCurrentSchema().
Improvements:
- Added support for creating an empty DataFrame with a specific schema using the Session.create_dataframe() method.
- Changed the logging level from INFO to DEBUG for several logs (e.g., the executed query) when evaluating a dataframe.
- Improved the error message when failing to create a UDF due to pickle errors.
Bug Fixes:
- Removed the pandas hard dependency in the Session.create_dataframe() method.
Dependency Updates:
- Added typing-extensions as a new dependency with version >=4.1.0.
0.5.0 (2022-03-22)
New Features
- Added the stored procedures API (see the sketch after this list).
- Added the Session.sproc property and sproc() to snowflake.snowpark.functions, so you can register stored procedures.
- Added Session.call to call stored procedures by name.
- Added UDFRegistration.register_from_file() to allow registering UDFs from Python source files or zip files directly.
- Added UDFRegistration.describe() to describe a UDF.
- Added DataFrame.random_split() to provide a way to randomly split a dataframe.
- Added the functions md5(), sha1(), sha2(), ascii(), initcap(), length(), lower(), lpad(), ltrim(), rpad(), rtrim(), repeat(), soundex(), regexp_count(), replace(), charindex(), collate(), collation(), insert(), left(), right(), endswith() to snowflake.snowpark.functions.
- Allowed call_udf() to accept literal values.
- Provided a distinct keyword in array_agg().
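A minimal sketch of the stored procedures API, shown with the current calling convention; it assumes an active session with the snowflake-snowpark-python package available, and the procedure name is illustrative:
from snowflake.snowpark import Session
from snowflake.snowpark.functions import sproc

@sproc(name="add_one", replace=True, packages=["snowflake-snowpark-python"])
def add_one(session: Session, x: int) -> int:
    return x + 1

session.call("add_one", 5)  # returns 6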
Bug Fixes:
- Fixed an issue that caused DataFrame.to_pandas() to have a string column if Column.cast(IntegerType()) was used.
- Fixed a bug in DataFrame.describe() when there is more than one string column.
0.4.0 (2022-02-15)
New Features
- You can now specify which Anaconda packages to use when defining UDFs.
- Added add_packages(), get_packages(), clear_packages(), and remove_package() to class Session.
- Added add_requirements() to Session so you can use a requirements file to specify which packages this session will use.
- Added the parameter packages to the function snowflake.snowpark.functions.udf() and the method UserDefinedFunction.register() to indicate UDF-level Anaconda package dependencies when creating a UDF.
- Added the parameter imports to snowflake.snowpark.functions.udf() and UserDefinedFunction.register() to specify UDF-level code imports.
- Added the parameter session to the function udf() and UserDefinedFunction.register() so you can specify which session to use to create a UDF if you have multiple sessions.
- Added the types Geography and Variant to snowflake.snowpark.types to be used as type hints for Geography and Variant data when defining a UDF.
- Added support for Geography geoJSON data.
- Added Table, a subclass of DataFrame for table operations:
- Methods update and delete update and delete rows of a table in Snowflake.
- Method merge merges data from a DataFrame to a Table.
- Overrode method DataFrame.sample() with an additional parameter seed, which works on tables but not on views and sub-queries.
- Added DataFrame.to_local_iterator() and DataFrame.to_pandas_batches() to allow getting results from an iterator when the result set returned from the Snowflake database is too large.
- Added DataFrame.cache_result() for caching the operations performed on a DataFrame in a temporary table. Subsequent operations on the original DataFrame have no effect on the cached result DataFrame.
- Added the property DataFrame.queries to get the SQL queries that will be executed to evaluate the DataFrame.
- Added Session.query_history() as a context manager to track SQL queries executed on a session, including all SQL queries to evaluate DataFrames created from a session. Both query ID and query text are recorded (see the sketch after this list).
- You can now create a Session instance from an existing established snowflake.connector.SnowflakeConnection. Use the parameter connection in Session.builder.configs().
- Added use_database(), use_schema(), use_warehouse(), and use_role() to class Session to switch database/schema/warehouse/role after a session is created.
- Added DataFrameWriter.copy_into_location() to unload a DataFrame to stage files.
- Added DataFrame.unpivot().
- Added Column.within_group() for sorting the rows by columns with some aggregation functions.
- Added the functions listagg(), mode(), div0(), acos(), asin(), atan(), atan2(), cos(), cosh(), sin(), sinh(), tan(), tanh(), degrees(), radians(), round(), trunc(), and factorial() to snowflake.snowpark.functions.
- Added an optional argument ignore_nulls in the functions lead() and lag().
- The condition parameter of the functions when() and iff() now accepts SQL expressions.
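A minimal sketch of Session.query_history(), assuming an existing session:
with session.query_history() as history:
    session.create_dataframe([1, 2, 3]).collect()

# Each record carries the query ID and SQL text.
for record in history.queries:
    print(record.query_id, record.sql_text)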
Improvements
- All function and method names have been renamed to use the snake case naming style, which is more Pythonic. For convenience, some camel case names are kept as aliases to the snake case APIs. It is recommended to use the snake case APIs.
- Deprecated these methods on class Session and replaced them with their snake case equivalents: getImports(), addImports(), removeImport(), clearImports(), getSessionStage(), getDefaultDatabase(), getDefaultSchema(), getCurrentDatabase(), getFullyQualifiedCurrentSchema().
- Deprecated these methods on class DataFrame and replaced them with their snake case equivalents: groupingByGroupingSets(), naturalJoin(), withColumns(), joinTableFunction().
- Property DataFrame.columns is now consistent with DataFrame.schema.names and the Snowflake database Identifier Requirements.
- Column.__bool__() now raises a TypeError. This bans the use of the logical operators and, or, not on Column objects; for instance, col("a") > 1 and col("b") > 2 will raise the TypeError. Use (col("a") > 1) & (col("b") > 2) instead.
- Changed PutResult and GetResult to subclass NamedTuple.
- Fixed a bug which raised an error when the local path or stage location has a space or other special characters.
- Changed DataFrame.describe() so that non-numeric and non-string columns are ignored instead of raising an exception.
Dependency updates
- Updated snowflake-connector-python to 2.7.4.
0.3.0 (2022-01-09)
New Features
- Added Column.isin(), with an alias Column.in_().
- Added Column.try_cast(), which is a special version of cast(). It tries to cast a string expression to other types and returns null if the cast is not possible.
- Added Column.startswith() and Column.substr() to process string columns.
- Column.cast() now also accepts a str value to indicate the cast type in addition to a DataType instance.
- Added DataFrame.describe() to summarize stats of a DataFrame.
- Added DataFrame.explain() to print the query plan of a DataFrame.
- DataFrame.filter() and DataFrame.select_expr() now accept a SQL expression.
- Added a new bool parameter create_temp_table to the methods DataFrame.saveAsTable() and Session.write_pandas() to optionally create a temp table.
- Added DataFrame.minus() and DataFrame.subtract() as aliases to DataFrame.except_().
- Added regexp_replace(), concat(), concat_ws(), to_char(), current_timestamp(), current_date(), current_time(), months_between(), cast(), try_cast(), greatest(), least(), and hash() to the module snowflake.snowpark.functions.
Bug Fixes
- Fixed an issue where Session.createDataFrame(pandas_df) and Session.write_pandas(pandas_df) raised an exception when the pandas DataFrame had spaces in the column name.
- DataFrame.copy_into_table() sometimes printed an error-level log entry while it actually worked. It's fixed now.
- Fixed an API docs issue where some DataFrame APIs were missing from the docs.
Dependency updates
- Updated snowflake-connector-python to 2.7.2, which upgrades the pyarrow dependency to 6.0.x. Refer to the Python connector 2.7.2 release notes for more details.
0.2.0 (2021-12-02)
New Features
- Updated the Session.createDataFrame() method for creating a DataFrame from a pandas DataFrame.
- Added the Session.write_pandas() method for writing a pandas DataFrame to a table in Snowflake and getting a Snowpark DataFrame object back.
- Added new classes and methods for calling window functions.
- Added the new functions cume_dist(), to find the cumulative distribution of a value with regard to other values within a window partition, and row_number(), which returns a unique row number for each row within a window partition.
- Added functions for computing statistics for DataFrames in the DataFrameStatFunctions class.
- Added functions for handling missing values in a DataFrame in the DataFrameNaFunctions class.
- Added the new methods rollup(), cube(), and pivot() to the DataFrame class.
- Added the GroupingSets class, which you can use with the DataFrame groupByGroupingSets method to perform a SQL GROUP BY GROUPING SETS.
- Added the new FileOperation(session) class that you can use to upload and download files to and from a stage.
- Added the DataFrame.copy_into_table() method for loading data from files in a stage into a table.
- In CASE expressions, the functions when() and otherwise() now accept Python types in addition to Column objects.
- When you register a UDF, you can now optionally set the replace parameter to True to overwrite an existing UDF with the same name.
Improvements
- UDFs are now compressed before they are uploaded to the server. This makes them about 10 times smaller, which can help when you are using large ML model files.
- When the size of a UDF is less than 8196 bytes, it will be uploaded as in-line code instead of uploaded to a stage.
Bug Fixes
- Fixed an issue where the statement df.select(when(col("a") == 1, 4).otherwise(col("a"))) (expected result [Row(4), Row(2), Row(3)]) raised an exception.
- Fixed an issue where df.toPandas() raised an exception when a DataFrame was created from large local data.
0.1.0 (2021-10-26)
Start of Private Preview
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file snowflake_snowpark_python-1.42.0.tar.gz.
File metadata
- Download URL: snowflake_snowpark_python-1.42.0.tar.gz
- Upload date:
- Size: 1.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e994b3860c816d1b5fdf0c6272f8d9e41505e470140b063ff9418d234fd8cc00 |
| MD5 | 33fe4482cc3a82b59977eb5e4d0b5f3b |
| BLAKE2b-256 | 78e3b70799997481185cdad44b0786c7597764935d78d71632b4735fb05d63a1 |
File details
Details for the file snowflake_snowpark_python-1.42.0-py3-none-any.whl.
File metadata
- Download URL: snowflake_snowpark_python-1.42.0-py3-none-any.whl
- Upload date:
- Size: 1.8 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fd92a3633b79573bb481b6e85a1434842758637dc6a30b32b9c5ce2824f4296d |
| MD5 | 32aaf6a045e916e266625b12f9170ec7 |
| BLAKE2b-256 | a3eaa3f1ff82aa144fd072f4be440ed636f4c298a7ee7a278e68709cf2753da5 |