Welcome to the EROS User Experience webinar series, where we talk to staff at EROS to learn more about the data, tools, and services coming out of the USGS Earth Resources Observation and Science, or EROS, Center. Today's webinar is entitled An Introduction to Landsat Data Access and Processing in the Cloud. I'm your host, Danielle Golon, the remote sensing user services lead here at EROS in Sioux Falls, South Dakota. The time is currently 12 p.m. Central, so we'll go ahead and get started. First, a few logistics to ensure the best audio experience. All participants have been muted. If you have any questions or comments during the webinar, please add them in the chat and we will address them at the end of the webinar. If the chat does not work for you, please feel free to email your questions to custserv@usgs.gov and we will answer them there. Today's webinar is being recorded. The recording will be available later on the USGS Landsat website, as well as the USGS Trainings YouTube channel and the USGS Media Gallery. At the start and end of the webinar, we will have a few polling questions. These polls are optional, but your answers can help us create a better user experience in the future. The polling questions will be available via the polls feature in Teams, or in the Teams chat for our audience members who are not able to use the Teams polls feature due to their organization settings. If the polls feature does not work for you, please feel free to respond to the polling questions in the chat instead. The questions are the same, so please either use the polls feature or the chat, whichever option works best for you. I will go ahead and launch our polls now while we finish the introduction to the webinar. You should now be able to see the first set of polling questions, both in the polls feature and in the chat, so feel free to fill those out at your leisure.
Today's webinar will consist of a presentation, several live demonstrations, and then a question and answer session at the end. Today's speaker is Tonian Robinson, a geospatial cloud support scientist with the USGS EROS user services team and the Annual National Land Cover Database, or Annual NLCD, team here at USGS EROS. A graduate of the University of South Florida in Tampa, Florida, Tonian has a PhD in geology focusing on geophysics. Tonian has worked as a contractor at EROS since the summer of 2023. Tonian's presentation will provide an overview of Landsat data in the cloud, as well as demonstrations of several scripts Tonian and the team have written on accessing and working with Landsat data in the cloud. Once Tonian has finished her presentation, we will then transition over to an optional set of final polling questions, and then we'll move on to the Q&A portion of the webinar. We have several EROS staff members from user services, the internal Landsat science team, and our access and archive cloud developers here at EROS on the line to help answer any questions you may have after Tonian has finished her presentation. Again, please feel free to add your questions or feedback throughout the webinar using the webinar chat. We'll try to answer all of your questions within the time allotted, but if we're not able to address your question during the Q&A portion, we will follow up with you offline. If there's a future webinar topic you'd like us to cover, please feel free to suggest that in the chat as well. With that, it's my pleasure to introduce today's speaker, Tonian Robinson. Take it away, Tonian. Hello everyone, I am Tonian Robinson, and welcome to this introduction to the materials that we've created in the past year to help you get started with working with Landsat data in the cloud. To start, a brief outline: first, I'm going to discuss how Landsat data is stored in the cloud.
I'm going to fully demo one tutorial and walk through the HTML format of two other tutorials, and then I'm going to highlight some valuable resources to help you get started with working in the cloud. To start: yes, Landsat data is available in the cloud. It's available in an Amazon Web Services S3 bucket located in the us-west-2 (Oregon) region. Users will need to specify requester pays when they're accessing this data. So what is available in the cloud? Currently we have Level-1 radiance products, which are the geometrically and radiometrically corrected products, and Level-2, which are the atmospherically corrected products, the surface reflectance and the surface temperature products. Then we have the ARD, the Analysis Ready Data products, and for Level-3, the Burned Area, the Fractional Snow Covered Area, and the Dynamic Surface Water Extent products are available. So how is data stored in the cloud? Here's an example with a Level-2 product that was collected with the Thematic Mapper in 2011, and this is the product's full name. Where is it stored? First, it's stored in the S3 bucket, the usgs-landsat S3 bucket, and this is a surface reflectance data product. Within the usgs-landsat S3 bucket, it's stored first under its collection number, which is two, and that's highlighted in the name. It's then stored under its level number, which is Level-2, since it's a Level-2 product. It's then stored under its projection, which is the standard projection, and that's not highlighted in the name; the standard projection in this case is the WGS 84 UTM projection. It's then stored under its sensor, in this case the Thematic Mapper sensor, which is labeled as TM here but highlighted as T in its name. It's then stored under its year, which is the year it was collected, and that is highlighted in its name. And then it's stored under its path and row, which are also highlighted in its name.
And lastly, it will be under its entire product name. Under this product structure, there are various objects, ranging from the metadata to band products, that are downloadable for the specific product. Now that you know how this data is stored, let's walk through how to find these data. The first step in finding the data is understanding how the metadata is structured. Landsat uses STAC, which is the SpatioTemporal Asset Catalog. It's a family of specifications that standardizes geospatial metadata, so it makes it easier for you to access the metadata related to geospatial products. We have a Landsat STAC catalog, and the Landsat STAC catalog includes various collections. These collections are groups of Landsat products; for example, one Landsat STAC collection is the surface reflectance collection, which is separate from the surface temperature collection. There are about 14 Landsat STAC collections, and within each collection there are individual Landsat data products, and those are referred to as STAC items. And lastly, within each item there are various assets, and these range from the thumbnail, to the metadata, to the various links to download the data that is available for the item. We have various ways for you to interact with the STAC. The first one here is the Landsat STAC browser, which is available through the USGS. It's really a way to just click through, based on the scene, the tile, or a Level-3 product, to find a product and see what is available with it and what metadata is associated with it. This is the non-programmatic way of interacting with the STAC. And here's an example with a Burned Area product. Upon clicking through to find this single product in the STAC browser, you will see the outline of this product. You will also see the metadata to the right, and there are tabs on top that you can click through to see the assets, the various bands, and the thumbnail.
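As a sketch of the storage hierarchy just described (collection number, level, projection, sensor, year, path, row, product name), the snippet below builds a bucket prefix from a product ID. This is an illustrative assumption based only on the layout described above, not an official API; the bucket name and directory names should be verified against the actual product links.

```python
def s3_prefix(product_id):
    """Build an assumed usgs-landsat prefix for a Level-2 product ID, following
    the hierarchy described above: collection / level / projection / sensor /
    year / path / row / product name."""
    # Sensor directory names are assumptions for illustration only.
    sensor_dirs = {"LT04": "tm", "LT05": "tm", "LE07": "etm",
                   "LC08": "oli-tirs", "LC09": "oli-tirs"}
    parts = product_id.split("_")        # e.g. ['LT05', 'L2SP', '023036', '20110501', ...]
    path, row = parts[2][:3], parts[2][3:]
    year = parts[3][:4]                  # acquisition year
    return ("s3://usgs-landsat/collection02/level-2/standard/"
            f"{sensor_dirs[parts[0]]}/{year}/{path}/{row}/{product_id}/")

print(s3_prefix("LT05_L2SP_023036_20110501_20200820_02_T1"))
```

The product ID here is a made-up Thematic Mapper example in the 2011 style mentioned above; the band objects and metadata would then sit under this prefix.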
The second way to non-programmatically interact with the STAC is through the STAC index. The STAC index lists all of the Landsat STAC collections. First, it starts with the collection names listed with their titles; there are 14 of them. You can click through any of them, depending on the product you want. You can click through to find individual scenes, see where they are located on the map, and see the various data associated with them. Similarly, this is an example using an analysis ready data set. Here is the outline of the scene, and the metadata is located on the right; you can scroll through to see what is associated with this data. Now, programmatically: at USGS, user services provides scripts, Python notebooks, to show you how to work with Landsat data in the cloud, and they're housed in the USGS GitLab that is highlighted here. I'll be going to the GitLab after this brief introduction to it. What is available in the GitLab? We have four projects related to the cloud. The first one is about accessing data, which is how to search for and pull down data. The second is processing, which is anything from calculating indices to filtering, and also visualizing; there is a lot to visualize. And of course there are quick guides, which are short guides on how to do something like create a GeoJSON file, and there are case studies, which are newer, and they're real-world examples of how to process Landsat data in the cloud. From here, I'll be going to the demo. First, I briefly mentioned the three tutorials that I'll be walking through. The first tutorial is just an introduction to STAC; this one I'll actually be downloading and running live. The second one is related to decoding the pixel QA band and using it for masking; I will not be downloading this, but I'll be walking through its HTML file. And the third one is related to pulling down a single pixel through time, filtering it, and also calculating indices.
So with that, I'm just going to head over to the GitLab and introduce you to it. Here is the USGS GitLab. I am not signed in, so that you can see what is readily available to you as a user. As you can see, there are five projects: the four I discussed that are related to the cloud, plus a machine-to-machine project, which just focuses on pulling down data using the machine-to-machine API. Since it's not cloud, I did not discuss it, but we have three machine-to-machine tutorials available there. In Access, we currently have about six tutorials, all related to interacting with the STAC API to pull data down. In Case Studies, we currently have one, which is related to compositing in the Ucayali region in Peru. In Processing in the Cloud, there are about seven, with more pending upload: various ways to process and filter the data. And in Quick Guides, we currently have two guides related to creating GeoJSONs, which are a really great resource when searching or interacting with the STAC to find data. So before I actually download the first tutorial, which I'm going to demo, I'm just going to go through and show you how the tutorials are set up in the lab. Here's an example with the recent creating composites with Landsat data tutorial. In every tutorial there is the notebook, the HTML file of the notebook after it's been run, and a .yml file, which is the file that you use to create your Python environment. Sometimes there are data folders with images or outputs, and util folders, which include the GeoJSON that is used for the area. The readme file is presented below the list of the files, and it's really just an introduction to the tutorial itself: it introduces the tutorial, compositing, and where it's located, and for every processing tutorial there are prerequisites listed and a table of contents, so you can see what's in the tutorial.
And lastly, and importantly, we also show how to set up the Python environment using the .yml file that is given to you as a user. This is what is provided with the tutorials. So I'm going to head back out to the main page, and I'm going to go find the tutorial that I'll be demonstrating, and it's the introduction to the Landsat STAC, which is a great starting point if you're interested. I'm going to first copy the HTTPS link for this, to pull it down using git. I'm in an empty directory that I created for this webinar, and I'm just going to activate the environment; it's already created, but I'm just going to activate it to get started accessing Landsat data. And then I'm going to say git clone, and I'm pasting the link to the tutorial, and I'm pulling it down into this directory that I have open for this webinar. To open the tutorial, I'm going to open Jupyter Notebook; all the tutorials are Jupyter notebooks, also referred to as Python notebooks. I'm going to open the tutorial, and there is everything associated with this one. This one comes with an HTML file that shows the notebook before running and one after, and of course a .yml file, which is necessary in this case. So I'm going to open the notebook and we're going to just start running through this tutorial. So again, this is a basic introduction to STAC, the Landsat STAC specifically. STAC is a standard for grouping geospatial metadata, and we use STAC here at USGS for Landsat, especially if you're trying to get Landsat data from the cloud. So that is the introduction to STAC, and the table of contents covers: what is STAC, importing packages, interacting with the API, and also showing you how to search for data. Landsat has a STAC API, which is just a link that you interact with when you're programmatically accessing the Landsat STAC catalog. And I wanted to emphasize that I do use the term collections.
Landsat STAC collections are groupings of Landsat data, while Landsat Collection 2 is really a reprocessing of the entire Landsat archive, and we are currently at Collection 2 in terms of archival processing. So when I refer to collections moving forward, I'm referring to just groupings of different Landsat data sets. We have the Landsat STAC catalog, and within the catalog there are collections, which are groupings of data sets, and within each collection there are STAC items, which are the scenes themselves, multiple Landsat scenes. To start, we're going to import only a single tool here, which is just requests. To interact with the STAC API, you do not need an Amazon Web Services account set up; it's really just making sure you have whatever program or module you're using to pull the data. Here I'm just using the requests function to interact with this Landsat STAC server and pulling down the response. This shows the various links, the version, and the title of the API that are pulled down. So it's really just JSON, just the various links that are attached to the Landsat STAC. And here we're printing some of the more useful links. Some of these, if you click them, you will find the various ways that you can search the Landsat STAC. The tutorials highlight more clearly how you can search, but the options are all available within the links in the Landsat STAC catalog. Here I'm just printing the useful information related to it: the version, the ID, and the type. It's a catalog, and there are 22 links associated with it. Next I'm running this cell, and it's printing all the children and various information related to the links, the 14 Landsat collections that are available. The STAC collections are listed as children, child items, in the catalog. So let's count how many collections are available.
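In code, that first interaction looks roughly like the sketch below: fetch the catalog root with requests, then walk its links for `rel == "child"` entries, which are the collections. The endpoint URL and collection IDs here are assumptions to verify against the tutorial, and the sample dict mimics the JSON shape so the parsing can be shown without a live request.

```python
import requests

# Assumed Landsat STAC API root (verify against the tutorial's endpoint).
STAC_URL = "https://landsatlook.usgs.gov/stac-server"

def fetch_catalog(url):
    """GET the STAC catalog root and return its JSON (links, version, title, ...)."""
    response = requests.get(url)
    response.raise_for_status()
    return response.json()

def child_collections(catalog):
    """STAC collections appear as links with rel == 'child' in the catalog."""
    return [link["href"] for link in catalog.get("links", [])
            if link.get("rel") == "child"]

# A trimmed sample of the catalog JSON, so the parsing runs offline:
sample_catalog = {
    "id": "stac-server", "type": "Catalog",
    "links": [
        {"rel": "self", "href": STAC_URL},
        {"rel": "child", "href": STAC_URL + "/collections/landsat-c2l2-sr"},
        {"rel": "child", "href": STAC_URL + "/collections/landsat-c2l2-st"},
    ],
}
print(len(child_collections(sample_catalog)))  # 2 children in this trimmed sample
```

Against the live catalog, `fetch_catalog(STAC_URL)` would return the full link list, and counting the children is how the 14 collections show up.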
And there are 14 Landsat groupings of products. Again, that is the Burned Area versus the surface reflectance versus Level-1; they're all grouped separately. And here we're just listing all the collections and the descriptions for all the collections, all 14 of them. And here we're actually looking at one of the collections, the second collection; in Python nomenclature, index one is actually the second. So the second collection is the surface temperature. And within the metadata for this entire collection you have the bounding box, the license, and the keywords, which are actually the platforms that are used to collect data for this collection. And there are various links that are endpoints specific to this collection. So within each STAC collection there are multiple items, and these are called STAC items, and these are single data products: a single scene collected at a specific time and place. And here's a single product. It comes with a lot of metadata: its ID, of course, its bounding box, its geometry, its properties, and if I scroll further, there are assets. These are broken down further in the tutorial, so I'll just move on to those. Within this single scene, which is a surface temperature scene since we had that collection selected, you have its description, its ID, its acquisition date, the platform that was used to collect it, cloud cover, and a number of assets, which are the bands, the metadata, and even the thumbnails. And here in this cell, I'm printing out the various assets that are associated with this single scene and their URLs. So there are multiple URLs attached to the products. You can have the LandsatLook URL, and you can individually click the URLs to download products if you'd like it that way.
However, STAC is best used if you're interacting with the cloud or pulling data from the cloud, because every asset has an S3 link where you can use an Amazon tool, or rasterio, or another Python library to pull it down. But they do all have links that you can click if you're interested in doing that, and I won't click them, but they're there and they work. Next up, in section three, we're just setting up a search, and this is really just creating a parameter dictionary to interact with the Landsat STAC endpoint. The first step here is just pulling the search endpoint link from the catalog that was pulled down, and also creating an empty parameter dictionary to start. This section is really just highlighting the parameters that we're going to search on and that are available for you to search; there are more that are available, and I will show you those when we move deeper into this. First a limit is set, and you can set a limit of up to 10,000, but here it's 400. Every item comes with a bounding box, and here's an example: if you want to use the bounding box of an item, you could use it by specifying it. But here we're using a bounding box in a different area for our search, just to not pull in the same item that we were looking at. So we added a bounding box here to the parameter dictionary, along with the limit, and now we're doing a search. What is expected from this search is for 400 items to be returned, because there's a limit of 400, and we expect more than 400 data sets to match when you're only specifying a bounding box. The next step is adding a temporal query, and this is really just an ISO 8601 datetime string that you include in the parameter dictionary. I'm going to run that to add it, and you can see the dictionary is updated with the datetime, and then I'm going to run the search.
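As a sketch, the parameter dictionary built up in these cells looks like the following. The bounding box and dates are made-up placeholders, and the POST to the catalog's search endpoint is shown but not executed here, since it needs the live endpoint.

```python
params = {}
params["limit"] = 400  # cap on returned items; the API allows up to 10,000
# Bounding box as [west, south, east, north] in lon/lat (placeholder values):
params["bbox"] = [-122.6, 48.2, -122.0, 48.7]
# Temporal query as an ISO 8601 datetime range string:
params["datetime"] = "2021-01-01T00:00:00Z/2021-12-31T23:59:59Z"

def run_search(search_endpoint, params):
    """POST the parameter dictionary to the STAC search endpoint (not run here)."""
    import requests
    response = requests.post(search_endpoint, json=params)
    response.raise_for_status()
    return response.json()

print(sorted(params))  # ['bbox', 'datetime', 'limit']
```

With only a bounding box and a limit, the returned feature count hits the 400-item cap; adding the datetime range is what narrows it down.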
So now we're limiting the search to a specific date range, and now 144 products are returned. And as a reminder, we can specify collections, and I'm heading to that now. In section three you can specify a collection, and in this case we are specifying two collections, the surface reflectance and the surface temperature, to further limit or restrict the search. So we're adding two collections to the parameter dictionary, and here in cell 20 I'm going to run the search with the collections added. Now we're reduced from 144 to 28, since only some of the products were actually from the surface reflectance and surface temperature collections. Next, again, we're printing a single item, and the first item that is returned is a surface temperature product. Again, it has all the metadata associated with the product: the geometry, the ID, the properties, and the assets. In this next line of code, I'm going to run this to show you the various properties associated with this single item. And this is important, because these properties are what are searchable for this item. Everything from the datetime, the cloud cover, the scene ID, and even the projection shape are searchable using STAC for this item. You search properties by creating a query, or adding a query to the parameter dictionary, and in this cell that I just ran, we're adding a cloud cover range from 0 to 60% and restricting the platforms to only Landsat 8 and 9. And I'm going to run the search to see the results, and the results show only 12 returns for that specific date range, bounding box, collections, cloud range, and platforms. So here in this very last cell, I'm just going to run it to show you the results. After all those specifications, we only have data from Landsat 8 and Landsat 9, they only have cloud cover below 60%, and they're all from the surface temperature and surface reflectance data sets. Thank you to Holly for creating this introduction
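Those last two restrictions, the collections list plus a property query, can be sketched like this. The collection IDs, property names, and query operators follow common STAC conventions and are assumptions to check against the notebook.

```python
# Restrict the search to the surface reflectance and surface temperature
# STAC collections (IDs assumed; confirm against the catalog's collection list):
params = {
    "collections": ["landsat-c2l2-sr", "landsat-c2l2-st"],
    # Property query: cloud cover from 0-60% and only Landsat 8/9 platforms.
    # Property names and operators follow the STAC query extension conventions.
    "query": {
        "eo:cloud_cover": {"gte": 0, "lte": 60},
        "platform": {"in": ["LANDSAT_8", "LANDSAT_9"]},
    },
}
print(len(params["collections"]))  # 2 collections requested
```

Merged into the bounding-box and datetime parameters from before, this is the combination that drops the result count from 144 down to the final 12 matching scenes.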
to STAC. The next two tutorials I'm going to walk through are the HTML formats of them, and they're related to processing Landsat data in the cloud. To start, I'll be heading to the decoding the Landsat QA pixel and using it for masking tutorial. The pixel QA band is really a quality layer that highlights the potential issues with the pixels that you're working with, for any Landsat scene. It's a 16-bit unsigned Cloud Optimized GeoTIFF, and it's downloadable through the S3 bucket as an asset with all the products. The pixel QA band can mask out clouds, snow, shadow, water, and other potential issues with your pixel. So in this use case example, there's not much of a story; it's really just, here is Vancouver. I selected the scene because the main focus is how to process the pixel QA band and use it with Python. This is the before and after result, which will also be shown at the end. To start, we have the various prerequisites that you can use; if you don't understand something, the information will be found in the prerequisite tutorials. We then import the modules, more Python modules in this case, because we're actually pulling data into memory. The great thing about working with data in the cloud is that you're pulling data into memory; you are not downloading it directly to your device, your computer. So the first step in many of the tutorials is creating the functions to interact with the Landsat STAC server. There are other tutorials that use the popular PySTAC; the earlier tutorials, like this one, use this fetch STAC server function that we've created, and it really just uses requests and catches errors if they're present. The next step is creating a parameter payload function, which is really similar to what I did previously, where we created the parameters for searching.
However, this is a very elaborate one, in a sense; since it's one of the first tutorials, it's supposed to show you the various ways to set up the parameter dictionary and the things that you can include in it if you would like, depending on how specific you want to go with your search. The next step is reading in a GeoJSON file. Pretty much most of these tutorials use GeoJSON files; in some of them I create the bounding box in the tutorial, but in this one I'm pulling a GeoJSON file in for use, and this GeoJSON file is in the WGS 84 reference system. The next step is plotting. We often create maps; of course, we're working with imagery of the Earth. Here is an outline of the area, plotted using the GeoJSON file that was imported, and we use folium for this plotting. So in section three, I'm just setting up to interact with the STAC server to get the metadata associated with the scene that I want to pull down. I'm really just setting up a bounding box from the imported GeoJSON, the area of interest, and here I'm only really using a single query. Remember, a query is on the various properties in a scene that are searchable. In this case, I'm using the scene ID property to search for this scene, because I know it's just going to bring me back what I want: the single scene that I'm interested in. And here I'm just putting everything on a single line to create the parameter dictionary, and in this case, I'm searching for the surface reflectance data. Now in this section, we're just submitting the parameter dictionary and then retrieving the product, and it returns one item. I know it will be one item, because there's only one surface reflectance scene with that scene ID, and here I'm just extracting the only item that is returned. In this step I am listing the band names. Landsat band names are present within the assets for a single product, and here I'm pulling the shortwave infrared, the NIR, and the red band names.
With the band names, I'm then parsing the single item's assets to find the S3 links for download, and then I'm saving them into this band links variable. Here in this section, I'm showing you what assets are available in terms of bands and data for this product, and it ranges from the coastal band to the swir22 band. The next step is very important in every tutorial, especially those related to processing: I discuss AWS and what is necessary. You need to have an AWS account, and you need to specify requester pays when you are accessing Landsat data in the cloud. There are two ways you can set up your AWS credentials for interacting with or pulling down Landsat data. And here is a simple function that uses rasterio for pulling down the data and cropping it to the bounding box that was imported as GeoJSON. The next step is pulling the bands into a single spectral array, and here I'm using that function, retrieve cog, to pull them into a single spectral array, which will be a three-banded array, with its x and y lengths printed out here. Next up is plotting to visualize the results, and here is the before image as a composite of all three bands. The next step, of course, since this is about the pixel QA, is pulling in the QA pixel band: first pulling in its link, then using the xarray function to pull it in as an xarray product, which is just an array of the data set, and it brings in its coordinates and other spatial metadata associated with it. So it's a single-banded array. In this section, I'm just listing the various flags that are within the QA pixel, which depend on the satellite. What's available in Landsat 8 right now is also available in Landsat 9, and only bits 14 to 15 aren't available for Landsat 4 through 7. And in this section, I'm just printing out the descriptions and the bit locations for the bits that are available in the QA pixel band. The next step is masking.
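A minimal sketch of that asset parsing is below. The item dict is a trimmed, hypothetical stand-in for a real STAC item, the "alternate" S3 href layout is an assumption to verify against the notebook, and the rasterio requester-pays lines are left commented because they need an AWS account to actually run.

```python
# Trimmed, hypothetical STAC item: each band asset has an HTTPS href plus an
# assumed "alternate" S3 href for in-cloud, requester-pays access.
item = {
    "assets": {
        "red": {
            "href": "https://example.com/SR_B4.TIF",
            "alternate": {"s3": {"href": "s3://usgs-landsat/example/SR_B4.TIF"}},
        },
    }
}

def s3_band_link(item, band):
    """Pull the S3 download link out of a STAC item's asset entry."""
    return item["assets"][band]["alternate"]["s3"]["href"]

band_link = s3_band_link(item, "red")
print(band_link)

# Reading it would then look roughly like this (requires AWS credentials):
# import rasterio
# with rasterio.Env(AWS_REQUEST_PAYER="requester"):
#     with rasterio.open(band_link) as src:
#         data = src.read(1)
```

Looping `s3_band_link` over the shortwave infrared, NIR, and red assets is the kind of parsing that fills the band links variable mentioned above.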
This section here is very elaborate, very long. It shows you step by step how to decode the QA bits and explains bit encoding, so you can understand what is going on in the functions that are presented. So this entire section here is how to decode the QA bits: what the symbols in Python mean when you're decoding a bit, and how to do that for a single bit. Upon reading this, you should actually understand how to decode the bits. And these functions, these two functions, decoding the bit and the masking, work together to decode a bit and to mask per whatever obstruction you want to mask out, whether it's clouds or shadows or water. So the QA mask function uses this decode function to decode whatever bit is located at a pixel location, and it includes the masks for all obstructions that could be in your pixel, and also includes the confidence masks and how you could use them. It's very long; again, we're not using all of these, it's just there so it's easy to understand how to use it to mask. But in 4.2 here, I'm applying the masking, and I'm only masking for clouds and shadow, and I'm creating a QA mask. Here in this next cell, I'm broadcasting that mask to the same shape as the spectral array, which includes the three bands. So now the QA mask is three-banded, with, of course, the same x and y range as the spectral array, just for quick masking. In this section here, I am masking the spectral array, which includes all three bands that were pulled in, using the QA mask, and filling areas that are snow, clouds, and shadows with zero. The next step is plotting, just a single plot for after the masking is complete. You can see that the majority, though not all, of the cloud, shadow, and snow pixels have been masked out using the QA band. And at the end of every processing tutorial, there's some evaluation, some discussion, some comparison. So in 4.3 here, I'm just again plotting the same results.
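The decode-and-mask idea can be boiled down to the sketch below: shift-and-AND to read one bit, OR the obstruction bits together, broadcast to the band dimension, and zero out the flagged pixels. The bit positions used (3 = cloud, 4 = cloud shadow, 5 = snow) follow the Collection 2 QA_PIXEL convention the tutorial describes; treat them as values to confirm against its bit table.

```python
import numpy as np

def decode_bit(qa, bit):
    """Return 1 where the given bit of the 16-bit QA value is set, else 0."""
    return (qa >> bit) & 1

# Tiny hypothetical QA_PIXEL array: clear, cloud (bit 3), shadow (bit 4), snow (bit 5).
qa = np.array([[0, 1 << 3],
               [1 << 4, 1 << 5]], dtype=np.uint16)

# Combine the obstruction bits into one mask (1 = masked out).
qa_mask = decode_bit(qa, 3) | decode_bit(qa, 4) | decode_bit(qa, 5)

# Broadcast to a 3-band spectral array and fill masked pixels with zero.
spectral = np.full((3,) + qa.shape, 100, dtype=np.uint16)
masked = np.where(qa_mask[np.newaxis, :, :] == 1, 0, spectral)

print(qa_mask.tolist())  # [[0, 1], [1, 1]]
```

Only the clear pixel keeps its spectral values; the confidence bits (paired two-bit fields) would be decoded the same way, just reading two adjacent bits instead of one.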
However, this time side by side, for you to see the before and after of the results of decoding the QA band and masking for Vancouver to remove snow, clouds, and shadows. So with this one done, I'm going to move on to the last demo, which is retrieving a single pixel through time. It's really focused on how to use just one point to get data within whatever date range you want, for whatever type or grouping of Landsat data you are pulling. So for a single pixel, this tutorial is really a time series tutorial, but of course it introduces point feature analysis and time series analysis, and, you know, it could be your starter for a change detection analysis. The use case scenario for this tutorial is a sinkhole that happened in July 2017 in Florida. It is called the Land O' Lakes sinkhole; it swallowed two houses, and many surrounding homes were condemned because that's how large it was. So what we're doing with this single pixel is: I've selected a single pixel inside this sinkhole, and I now want to see how this pixel appears through time, before and after the sinkhole collapse. Can we see the sinkhole collapse in the Normalized Difference Vegetation Index, or NDVI, which is the vegetation index where a high NDVI indicates the presence of healthier vegetation, and a lower one indicates unhealthy vegetation or just the presence of water or no vegetation? So can we see this collapse in the NDVI values within the sinkhole through time? This image here is the Google image of the sinkhole in January, before it collapsed, and then in May 2023, five years after its collapse. So it's been through stages; you can even see more in this one a bit. So to start: prerequisite tutorials, where, if you are stuck somewhere, you can find the information; a table of contents, showing what is available in this tutorial; and next up, importing all the modules that are used.
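The NDVI computation at the heart of that question is simple; below is a sketch on made-up reflectance values for one pixel at three dates, where the drop in the last value is the kind of change the tutorial looks for around the collapse.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    return (nir - red) / (nir + red)

# Hypothetical surface reflectance for one pixel at three acquisition dates.
red = np.array([0.10, 0.12, 0.30])
nir = np.array([0.50, 0.48, 0.32])

values = ndvi(nir, red)
print(np.round(values, 3).tolist())  # healthy, healthy, then a sharp drop
```

NDVI ranges from -1 to 1; the fall from roughly 0.6-0.7 toward zero at the last date is what a bare or water-filled sinkhole pixel would look like in the time series.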