Group 1

Here is our group’s proposal.

The Centers for Medicare & Medicaid Services (CMS) has made a data set available with “information on prescription drug events incurred by Medicare beneficiaries with a Part D prescription drug plan”; specifically, for each combination of NPI and drug it has the amounts prescribed and the cost incurred. The data set, built from 2013 data, has over 23 million rows where each row corresponds to a particular drug and a particular physician. There are 808,000 unique NPIs and 2700 unique drugs. By far the most commonly occurring drug is hydrocodone-acetaminophen and the physician with by far the most unique drug prescriptions is one Dr. Daniel Hurley of Beech Grove, IN with over 540 different drugs.

We want to consider the subgraph of the NPI referral network corresponding only to the NPIs in this sample and see how this relates to the overall referral network. Our first task will be to investigate any systematic differences between the subgraph and the supergraph; for example, we want to see how high degree nodes and low degree nodes of the supergraph appear in the sample. We are also interested in how the structure of the supergraph is mapped to the subgraph. To that end we will look at how triangles and communities of the supergraph appear in the subgraph. Other questions will certainly arise once we really get into this but for now these are the main questions that we have.
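A minimal sketch of the degree comparison, assuming the referral network is available as a list of NPI pairs and treating referrals as undirected for the purpose of counting neighbors. The function names and the toy edge list are ours for illustration and are not part of either data set.

```python
from collections import defaultdict

def degrees(edges, keep=None):
    """Undirected degree of each node.

    If `keep` is given, only edges with BOTH endpoints in `keep` are
    counted, i.e. degrees are taken in the induced subgraph.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-referrals
        if keep is not None and (u not in keep or v not in keep):
            continue
        nbrs[u].add(v)
        nbrs[v].add(u)
    return {n: len(s) for n, s in nbrs.items()}

def retained_degree_fraction(edges, sample):
    """For each sampled node, the share of its supergraph neighbors
    that are also in the sample."""
    sample = set(sample)
    full = degrees(edges)
    sub = degrees(edges, keep=sample)
    return {n: sub.get(n, 0) / full[n] for n in sample if n in full}

# Toy supergraph; nodes 1, 2, 3 play the role of the Part D sample.
toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
fractions = retained_degree_fraction(toy_edges, sample={1, 2, 3})
```

Plotting this fraction against supergraph degree would show whether high degree nodes lose proportionally more or fewer neighbors in the sample than low degree nodes.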

There are known issues with this data set, such as that it only has information on Medicare related drug use and that there may be some misattribution of organizational behavior to specific NPIs; nevertheless, we hope that there will be some interesting and meaningful patterns revealed by considering how this subgraph differs from the network as a whole.

Response

You guys sound like you are on a good path… keep going. Of course, this still needs further development.

Think about what “functions” you want to explain wrt the prescription data set. One interesting summary statistic might be someone’s “entropy” in their drug prescriptions; it is a measure of how variable they are in making prescriptions. Intuitively, if someone is making a wide array of prescriptions, then they are seeing a diverse/interesting patient population AND they have a lot of knowledge about drugs.
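Here is a minimal sketch of that entropy calculation, using the usual Shannon entropy of each prescriber's drug mix. It assumes the data has been reduced to (NPI, drug, claim count) triples; those field names and the toy rows are placeholders, not the actual CMS column names.

```python
import math
from collections import defaultdict

def prescription_entropy(rows):
    """Return {npi: Shannon entropy in bits of that NPI's drug mix}.

    `rows` is an iterable of (npi, drug_name, claim_count) tuples.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for npi, drug, n_claims in rows:
        counts[npi][drug] += n_claims

    entropy = {}
    for npi, by_drug in counts.items():
        total = sum(by_drug.values())
        h = 0.0
        for c in by_drug.values():
            if c > 0:  # zero-count drugs contribute nothing
                p = c / total
                h -= p * math.log2(p)
        entropy[npi] = h
    return entropy

# Toy data: A splits evenly over two drugs, B prescribes one drug,
# C splits evenly over four drugs.
toy_rows = [
    ("A", "drug1", 50), ("A", "drug2", 50),
    ("B", "drug1", 100),
    ("C", "drug1", 25), ("C", "drug2", 25),
    ("C", "drug3", 25), ("C", "drug4", 25),
]
H = prescription_entropy(toy_rows)
```

A prescriber with a single drug gets entropy 0, and one who spreads claims evenly over k drugs gets log2(k) bits, so the statistic grows with both the number of drugs and how evenly they are used.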

Where are those high entropy people in the referral network? Are they the first doctors that you see or are they the last doctors that you see? Presumably this varies by community; which community has better health outcomes?

Group 2

The data we’re going to investigate is the Medicare Physician and Other Supplier NPI Aggregate table, found on the CMS website. The table is well organized by NPI and contains columns of categorical and numerical data to be analyzed. Further information on the data can be found in the Methodology Handbook.

So far, our group members have proposed several interesting questions. The first step is to match the data with the 2013 physician referral network by NPI; the two data sets match quite well.

After linking the data to the referral network, we can start by exploring the relationship between NPIs and Medicare payments, e.g., the average charge a provider submitted or the average Medicare payment amount. It is interesting to see whether the referral network shows patterns in terms of Medicare payments: do physicians with higher payments tend to refer their patients to other high-payment physicians? Answering questions like this will rely heavily on network analysis techniques.
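One simple way to make the high-payment question concrete is to correlate the payment of the referring physician with the payment of the receiving physician over all referral edges. The sketch below is only one possible measure, and it assumes a dictionary from NPI to average Medicare payment and a list of (sender, receiver) pairs; the names and toy values are made up for illustration.

```python
import math

def edge_payment_correlation(edges, payment):
    """Pearson correlation between payment[sender] and payment[receiver]
    over directed referral edges. Edges with a missing payment are skipped.
    Returns NaN when the correlation is undefined."""
    pairs = [(payment[u], payment[v]) for u, v in edges
             if u in payment and v in payment]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

# Toy example: a and b are high-payment, c and d are low-payment.
toy_payment = {"a": 100.0, "b": 120.0, "c": 10.0, "d": 12.0}
like_with_like = [("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")]
high_with_low = [("a", "c"), ("c", "a"), ("b", "d"), ("d", "b")]
r_same = edge_payment_correlation(like_with_like, toy_payment)
r_mixed = edge_payment_correlation(high_with_low, toy_payment)
```

A value near +1 means high-payment physicians mostly refer to other high-payment physicians; a value near -1 means the opposite. Payments are heavily skewed, so a log transform or a rank correlation may be more appropriate on the real data.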

The data also includes the prevalence of several kinds of diseases among beneficiaries, such as cancer, diabetes, depression and so on. Obviously, the ratio of each kind of illness varies from one NPI to another, since different doctors specialize in different fields. Hopefully, we can reach a more explicit conclusion on this issue after exploring the data. We can also study the network for specific diseases, such as the network of heart-disease physicians in Wisconsin, and compare it with those of other states.

In addition, we will use the zip codes in the dataset to explore the geographical relationships among and within the networks and clusters derived from the previous analysis. We may also obtain some insights into the referral network by looking at the geographical distribution of the physicians and hospitals. Showing clusters on a map should be informative for our study.

Hopefully, by the end of the project, we’ll be able to find some interesting results using social network techniques. Note that there are five members in our group, because we all live in the same apartment and have compatible schedules for meeting and discussing; we don’t expect the group to be too large, since there are quite a few things that we’d like to investigate in this dataset.

Response

I like your idea of using the clusters from the zipcode analysis. Think about summary measures of the networks INSIDE each of those clusters, i.e., think of each cluster as a unit of observation!
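As a rough sketch of what “each cluster as a unit of observation” could look like, the code below builds one row of summary measures per cluster. It assumes a dictionary from NPI to cluster label (for example a zip code) and a list of directed referral pairs; the function name, the chosen measures, and the toy labels are illustrative only.

```python
from collections import Counter

def cluster_summaries(edges, cluster_of):
    """One summary record per cluster.

    density = internal referrals / (n * (n - 1)), the share of possible
    directed within-cluster referrals that actually occur.
    """
    sizes = Counter(cluster_of.values())
    internal = Counter()
    outgoing = Counter()
    for u, v in set(edges):  # ignore duplicate edges
        if u == v or u not in cluster_of or v not in cluster_of:
            continue
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            internal[cu] += 1
        else:
            outgoing[cu] += 1

    summaries = {}
    for c, n in sizes.items():
        possible = n * (n - 1)
        summaries[c] = {
            "n_physicians": n,
            "internal_referrals": internal[c],
            "outgoing_referrals": outgoing[c],
            "density": internal[c] / possible if possible else 0.0,
        }
    return summaries

# Toy example with two clusters labeled by made-up zip codes.
toy_cluster = {1: "53703", 2: "53703", 3: "53703", 4: "53711", 5: "53711"}
toy_edges = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 5)]
rows = cluster_summaries(toy_edges, toy_cluster)
```

Each record can then be joined to cluster-level covariates or outcomes and analyzed like any other table with one row per observation.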

Group 3

The papers by Uddin, Hossain et al. and Barnett, Christakis, et al. conclude that there is a strong correlation between physician network structure and cost of hospitalization, rate of readmission, and care pattern. Complications during a clinical procedure are one of the causes of increased hospitalization cost as well as of readmission. In this project we are going to study the correlation between physician referral network structure (WHAT EXACTLY ABOUT THE STRUCTURE ARE YOU GOING TO STUDY? LOTS OF POSSIBILITIES!) and complications during certain clinical procedures. This may lead to recommendations for improving the referral network structure of a given healthcare organization. We are planning to use the hospital complication datasets at https://data.medicare.gov/data/hospital-compare, e.g. the data set that includes provider data for the hip/knee complication measure (the score column here represents the rate of complications), and the Agency for Healthcare Research and Quality (AHRQ) measures of serious complications (GOOD!). We plan to link these to our physician referral network data set through the CCN (claims-based hospital affiliation) columns included in the Physicians_compare dataset. We understand that there may be factors other than hospital quality that lead to an increased rate of complications and/or readmission. We plan to investigate whether the average income of an area (zipcode) has any relation to the rate of complications. We know that there are datasets on the costs paid through Medicare to a given physician. Time permitting, we plan to see what effect the rate of complications has on the actual payments.

Response

Think more about what “structure” in the social network you want to study.

Group 4

Group focus: Health Care Provider Re-Admission rates in relation to providers’ average physician coreness, medicare spending, surrounding population average income and surrounding population level of health. MOAR DATA ON INCOME: http://www.hipxchange.org/ADI

Data we’ll be using includes readmission data by hospital from data.medicare.gov, surrounding population data from ACS and Medicare spending from CMS.gov, in addition to the data sets we’ve been using in class. As some of our data is only available on the provider level, much of our analysis will be done on the provider level (I LIKE THIS LEVEL OF ANALYSIS). While we can’t generalize provider data to physicians so easily, we can gain insight about providers based on their physicians. We’re interested in analyzing the average physician coreness for a provider and what that says about re-admissions; Mascia’s paper on re-admissions and referral networks has shown there to be interesting relationships here.
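A minimal sketch of the average-coreness measure, assuming the referral network is treated as undirected and each physician is mapped to a single provider. Coreness is computed by the standard peeling procedure (repeatedly remove a minimum-degree node); this simple version is quadratic in the number of nodes, so a library implementation would be preferable on the full network. The function names and toy affiliations are hypothetical.

```python
from collections import defaultdict

def core_numbers(edges):
    """Coreness (k-core number) of every node of an undirected graph."""
    nbrs = defaultdict(set)
    for u, v in edges:
        if u != v:
            nbrs[u].add(v)
            nbrs[v].add(u)
    degree = {n: len(s) for n, s in nbrs.items()}
    remaining = set(nbrs)
    core = {}
    k = 0
    while remaining:
        # Peel off a node of minimum remaining degree.
        n = min(remaining, key=degree.__getitem__)
        k = max(k, degree[n])
        core[n] = k
        remaining.remove(n)
        for m in nbrs[n]:
            if m in remaining:
                degree[m] -= 1
    return core

def mean_coreness_by_provider(core, provider_of):
    """Average coreness of the physicians affiliated with each provider.
    Physicians absent from the network count as coreness 0."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for npi, provider in provider_of.items():
        totals[provider] += core.get(npi, 0)
        counts[provider] += 1
    return {p: totals[p] / counts[p] for p in counts}

# Toy network: a triangle (1, 2, 3) with physician 4 hanging off node 3;
# physician 5 has no referrals at all.
toy_core = core_numbers([(1, 2), (2, 3), (1, 3), (3, 4)])
toy_affiliation = {1: "H1", 2: "H1", 3: "H2", 4: "H2", 5: "H2"}
toy_means = mean_coreness_by_provider(toy_core, toy_affiliation)
```

The resulting provider-level averages can be joined to the readmission data by hospital identifier.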

We still need to consult Raj or a doctor to better understand different measures of re-admission and how the re-admission rate can represent the “goodness” of a hospital. One of our first steps will be to gain a deeper understanding of the data for further analysis.

Response

GOOD. YOU CAN THINK ABOUT OTHER SUMMARY MEASURES BEYOND CORENESS. FOR EXAMPLE, DOES THE NETWORK EASILY PARTITION? HOW STRONG IS THE RELATIONSHIP BETWEEN SPECIALTY TYPE A AND TYPE B? ETC.

Good work! Keep going!

Group 5

1 Data Files

In this project, we used two data files: Medicare Provider Utilization and Payment Data and Physician Referral Data.

2

We are interested in the following problems:

1. What’s the referral network for different Medicare Specialties?

To study this problem, we use spectral clustering under the Stochastic Block Model (SBM). We cluster receiving nodes by using prior cluster information (i.e. Medicare Specialties) of sending nodes, and vice versa. Then, we will see how the different Medicare Specialties perform (“PERFORM” BY WHAT METRIC?) in the referral network from both the receiving and the sending side, and what the differences and similarities are between the sending or receiving communities of different Medicare Specialties. (We mainly use the variable Provider Type in the payment data file. Due to the complexity of the types, we only focus on Wisconsin organizations here.) (YOU COULD COMPARE ONE LOCATION TO ANOTHER LOCATION)

2. What’s the referral network for different Medicare Allowed Amounts?

To study this problem, we first divide Medicare Allowed Amounts into levels: low, medium, and high. (ARE YOU GOING TO CONTROL FOR SPECIALTY?) By using this prior cluster information in spectral clustering under SBM, we study how the different levels of Medicare Allowed Amount perform in the referral network from both the receiving and the sending side. We can also repeat this process with actual Medicare Payment Amounts to see whether the two sets of conclusions agree or differ.

3. We will study HCPCS descriptions, i.e. healthcare common procedure descriptions, to see how Medicare procedures differ across clusters of the referral network. (HCPCS stands for Healthcare Common Procedure Coding System.)

4. Odds and Ends:

(a) What’s the number of so-called directed loop triangles in the referral network?

(b) There are only 11 types of organizations in Wisconsin, while there are 35 types of organizations in the whole data file. What are the types that we don’t have in WI, how many are there, and where are they?

(c) About Medicare Payment Amounts:

   1. How differently beneficiaries benefit over the referral network for different Medicare Specialties;

   2. The relationship between average Medicare submitted charge amount / average Medicare payment amount / benefits and referral distance;

   3. The relationship between average Medicare submitted charge amount / average Medicare payment amount / benefits and living costs based on the zip codes of the referral network.
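For the directed loop triangle question in Odds and Ends, a minimal counting sketch follows. It assumes a “directed loop triangle” means a directed 3-cycle u → v → w → u among three distinct physicians, and that the referral data is a list of (sender, receiver) pairs; the function name and toy edges are ours.

```python
from collections import defaultdict

def count_directed_loop_triangles(edges):
    """Number of directed 3-cycles u -> v -> w -> u (self-loops ignored)."""
    succ = defaultdict(set)
    for u, v in edges:
        if u != v:
            succ[u].add(v)

    closed_paths = 0
    for u in list(succ):
        for v in succ[u]:
            # .get avoids adding new keys to the defaultdict while looping
            for w in succ.get(v, ()):
                if w != u and u in succ.get(w, ()):
                    closed_paths += 1
    # Every cycle is found once from each of its three starting nodes.
    return closed_paths // 3

# 1 -> 2 -> 3 -> 1 is a loop triangle; the extra edges do not create another.
n_loops = count_directed_loop_triangles([(1, 2), (2, 3), (3, 1), (1, 3), (3, 4)])
# 1 -> 2, 2 -> 3, 1 -> 3 is a triangle but not a loop.
n_transitive = count_directed_loop_triangles([(1, 2), (2, 3), (1, 3)])
```

The same count equals trace(A^3) / 3 for the adjacency matrix A with no self-loops, which gives a check on small subgraphs.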

Response

IT SOUNDS LIKE YOU HAVE THREE PROJECTS HERE. PICK ONE AND DEVELOP IT MORE.