:py:mod:`summary_stats_baseline`
================================

.. py:module:: summary_stats_baseline

.. autoapi-nested-parse::

   Summary statistics baseline computations


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   summary_stats_baseline.SummaryStats
   summary_stats_baseline.Matcher


Functions
~~~~~~~~~

.. autoapisummary::

   summary_stats_baseline.get_number_counts
   summary_stats_baseline.get_inv_dist_number_counts
   summary_stats_baseline.get_optimal_threshold
   summary_stats_baseline.match


.. py:function:: get_number_counts(x, batch_indices)

   Get the unweighted number counts

   Parameters
   ----------
   x : torch.Tensor
       Input features of shape [n_nodes, n_features] for a given batch
   batch_indices : torch.Tensor
       Batch indices of shape [n_nodes] for a given batch
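
A minimal sketch of what an unweighted number count could look like for ``get_number_counts``, assuming ``batch_indices`` assigns each node to one example in the batch and that the count is the number of nodes per example. NumPy stands in for torch here, and the helper name ``count_nodes_per_example`` and its ``n_examples`` argument are hypothetical, not part of this module.

```python
import numpy as np


def count_nodes_per_example(batch_indices, n_examples=None):
    """Count how many nodes belong to each example in a batch (illustrative)."""
    batch_indices = np.asarray(batch_indices, dtype=np.int64)
    # minlength keeps trailing empty examples in the output with a count of zero
    minlength = 0 if n_examples is None else n_examples
    return np.bincount(batch_indices, minlength=minlength)


# Five nodes spread over three examples: two, one, and two nodes
batch_indices = np.array([0, 0, 1, 2, 2])
counts = count_nodes_per_example(batch_indices)
```

The features ``x`` are not needed for this count under the stated assumption, since only the node-to-example assignment matters.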


.. py:function:: get_inv_dist_number_counts(x, batch_indices, pos_indices)

   Get the inverse-distance weighted number counts

   Parameters
   ----------
   x : torch.Tensor
       Input features of shape [n_nodes, n_features] for a given batch
   batch_indices : torch.Tensor
       Batch indices of shape [n_nodes] for a given batch
   pos_indices : list
       The two column indices of x that hold ra and dec, in that order
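
A rough sketch of an inverse-distance weighted count in the spirit of ``get_inv_dist_number_counts``. It assumes that each node's distance is its Euclidean offset from the origin of the (ra, dec) plane and that each node contributes a weight of 1/distance; the actual distance definition used by the module is not stated here. NumPy stands in for torch, and the helper name ``inv_dist_weighted_counts`` and the ``eps`` guard are hypothetical.

```python
import numpy as np


def inv_dist_weighted_counts(x, batch_indices, pos_indices=(0, 1), eps=1e-12):
    """Sum 1/distance over the nodes of each example (illustrative)."""
    x = np.asarray(x, dtype=float)
    batch_indices = np.asarray(batch_indices, dtype=np.int64)
    ra, dec = x[:, pos_indices[0]], x[:, pos_indices[1]]
    # Assumed distance: Euclidean offset from the origin of the (ra, dec) plane
    dist = np.hypot(ra, dec)
    # eps guards against division by zero for a node exactly at the origin
    weights = 1.0 / np.maximum(dist, eps)
    return np.bincount(batch_indices, weights=weights)


# Three nodes with (ra, dec, extra feature); the first two share an example
x = np.array([[3.0, 4.0, 0.1], [0.0, 2.0, 0.2], [1.0, 0.0, 0.3]])
batch_indices = np.array([0, 0, 1])
weighted = inv_dist_weighted_counts(x, batch_indices)
```

Nearby nodes dominate this statistic, whereas the unweighted count treats all nodes equally.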


.. py:class:: SummaryStats(n_data, pos_indices=[0, 1])

   .. py:method:: update(batch, i)

      Update `stats` with a new batch

      Parameters
      ----------
      batch : array or dict
          New batch whose contents can be accessed by the functions in
          `loader_dict`
      i : int
          Position of the batch, i.e. the batch is the i-th batch


   .. py:method:: set_stats(stats_path)

      Load previously stored stats

      Parameters
      ----------
      stats_path : str
          Path to the .npy file of the stats dictionary


   .. py:method:: export_stats(stats_path)

      Export the `stats` attribute to disk as a .npy file

      Parameters
      ----------
      stats_path : str
          Path of the .npy file to which the stats dictionary is written
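
The round trip behind ``export_stats`` and ``set_stats`` can be sketched with plain NumPy, assuming the stats are held in a dictionary of arrays keyed by summary statistic name. The dictionary contents and the file name below are made up for illustration, and the class may store its stats differently.

```python
import os
import tempfile

import numpy as np

# Hypothetical stats dictionary keyed by summary statistic name
stats = {"N": np.array([2.0, 1.0, 2.0]), "N_inv_dist": np.array([0.7, 1.0, 0.4])}

with tempfile.TemporaryDirectory() as tmp_dir:
    stats_path = os.path.join(tmp_dir, "stats.npy")
    # np.save stores a dict by pickling it inside a 0-d object array
    np.save(stats_path, stats, allow_pickle=True)
    # allow_pickle=True is required to read it back; .item() unwraps the dict
    loaded = np.load(stats_path, allow_pickle=True).item()
```

Because loading relies on pickle, a file stored this way should only be read from a trusted source.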


.. py:class:: Matcher(train_stats, test_stats, train_y, out_dir, test_y=None)

   .. py:method:: match_summary_stats(thresholds, interim_pdf_func=None, min_matches=1000, k_max=np.inf)

      Match summary stats between the train and test sets

      Parameters
      ----------
      thresholds : dict
          Matching thresholds for each summary stat.
          Keys should be one or both of 'N' and 'N_inv_dist'.
      interim_pdf_func : callable, optional
          Interim prior PDF with which to reweight the samples


   .. py:method:: get_samples(idx, ss_name, threshold=None)

      Get the raw accepted samples, before any reweighting

      Parameters
      ----------
      idx : int
          ID of sightline
      ss_name : str
          Summary stats name
      threshold : int, optional
          Matching threshold. If None, use the optimal threshold.
          Default: None

      Returns
      -------
      np.ndarray
          Samples of shape `[n_matches]`


   .. py:method:: get_overview_table()


.. py:function:: get_optimal_threshold(thresholds, n_matches, min_matches=1000)

   Get the smallest threshold that has some minimum number of matches

   Parameters
   ----------
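   thresholds : array-like
       Candidate matching thresholds
   n_matches : array-like
       Number of matches obtained at each threshold
   min_matches : int
       Minimum number of matches required. Default: 1000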
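
The selection rule described for ``get_optimal_threshold`` can be sketched as follows, assuming ``thresholds`` and ``n_matches`` are aligned element by element. The helper name ``smallest_sufficient_threshold`` is hypothetical, and returning ``None`` when no threshold qualifies is an assumption rather than the module's documented behavior.

```python
import numpy as np


def smallest_sufficient_threshold(thresholds, n_matches, min_matches=1000):
    """Pick the smallest threshold with at least min_matches matches (illustrative)."""
    thresholds = np.asarray(thresholds)
    n_matches = np.asarray(n_matches)
    sufficient = n_matches >= min_matches
    if not sufficient.any():
        # Assumed fallback: signal that no threshold is loose enough
        return None
    return thresholds[sufficient].min()


# Looser thresholds accept more training examples
thresholds = [0, 1, 2, 5]
n_matches = [12, 480, 1500, 9000]
best = smallest_sufficient_threshold(thresholds, n_matches)
```

Here the thresholds 2 and 5 both reach 1000 matches, and the smaller of the two is kept so that the match stays as tight as possible.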


.. py:function:: match(train_x, test_x, train_y, threshold)

   Match summary stats between train and test within a given threshold

   Parameters
   ----------
   train_x : np.ndarray
       Train summary stats
   test_x : float
       Test summary stat
   train_y : np.ndarray
       Train target values
   threshold : float
       Closeness threshold on which matching is based

   Returns
   -------
   tuple
       Boolean mask over train_y marking the accepted samples, and the
       accepted samples themselves
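
A minimal sketch of threshold matching in the spirit of ``match``, assuming that a training example is accepted when the absolute difference between its summary stat and the test summary stat does not exceed the threshold. Whether the module uses an inclusive or strict comparison is not stated here, and the helper name ``match_within_threshold`` is hypothetical.

```python
import numpy as np


def match_within_threshold(train_x, test_x, train_y, threshold):
    """Return the acceptance mask and the accepted targets (illustrative)."""
    train_x = np.asarray(train_x)
    train_y = np.asarray(train_y)
    # Assumed criterion: absolute difference no larger than the threshold
    mask = np.abs(train_x - test_x) <= threshold
    return mask, train_y[mask]


# Four training sightlines with their summary stat and target value
train_x = np.array([10.0, 12.0, 15.0, 11.0])
train_y = np.array([0.01, 0.02, 0.05, 0.03])
mask, accepted = match_within_threshold(
    train_x, test_x=11.0, train_y=train_y, threshold=1.0
)
```

The accepted targets act as samples for the test sightline, so the threshold trades sample count against how closely the training examples resemble the test one.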