Questions tagged [machine-learning]

From The Discipline of Machine Learning by Tom Mitchell:

The field of Machine Learning seeks to answer the question "How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?" This question covers a broad range of learning tasks, such as how to design autonomous mobile robots that learn to navigate from their own experience, how to data mine historical medical records to learn which future patients will respond best to which treatments, and how to build search engines that automatically customize to their user's interests. To be more precise, we say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.

3322 questions

0 answers

Analytic solution for $x = \frac{n}{1 + e^{-ax + t}}$, i.e. when is the output of a parametrised logistic function equal to the input?

I would like to know when the input to a parametrised logistic function (the right-hand side) is equal to its output. I've been trying to solve the following equation: $$x = \frac{n}{1 + e^{-ax + t}}$$ Is it possible to solve this equation…
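A closed form may not exist for this equation, but a fixed point can be located numerically. The following is a minimal sketch, assuming $n > 0$; the parameter values `n=2.0`, `a=1.0`, `t=0.5` and the helper names are hypothetical, chosen only for illustration. Since the logistic value lies strictly between $0$ and $n$, the function $g(x) = \frac{n}{1 + e^{-ax + t}} - x$ is positive at $x = 0$ and negative at $x = n$, so bisection on $[0, n]$ finds a root.

```python
import math

def logistic(x, n, a, t):
    """Parametrised logistic function n / (1 + exp(-a*x + t))."""
    return n / (1.0 + math.exp(-a * x + t))

def fixed_point(n, a, t, tol=1e-12, max_iter=200):
    """Bisection on g(x) = logistic(x) - x over [0, n], assuming n > 0.

    g(0) > 0 and g(n) < 0, so the bracket always contains a root.
    """
    lo, hi = 0.0, float(n)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if logistic(mid, n, a, t) - mid > 0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical parameter values, for illustration only.
x_star = fixed_point(n=2.0, a=1.0, t=0.5)
residual = abs(logistic(x_star, 2.0, 1.0, 0.5) - x_star)
print(x_star, residual)
```

For some parameter choices the equation can have more than one solution in $(0, n)$; bisection returns only one of them.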

machine-learning

asked Sep 27 '20 at 17:32

mrsquee

1 answer

understanding equation from adversarial learning paper

$$\min_{\theta \in \Theta} \sup_{P: D(P, P_0) \leq \rho} \mathbb{E}_P\left[ \ell(\theta;(X,Y)) \right]$$ The above equation is from this ML paper (https://arxiv.org/pdf/1805.12018.pdf), at the top of page 2. I'm having a hard time understanding…

machine-learning

asked Jun 08 '20 at 04:03

cmed123

1 answer

Non-linear autoencoder versus linear autoencoder (PCA)

Is it possible for some non-linear autoencoder to compress the input data into the same dimension as PCA while preserving more information than PCA does?

machine-learning

asked May 28 '20 at 14:53

Sam

1 answer

Fitting a charge-like curve

I am thinking of fitting a charge-like curve to a few data points of a time series. By charge curve I mean that I expect my time series to be increasing and converging toward a limit value. I thought of the classic capacitor charge function: y =…

machine-learning

asked May 23 '20 at 10:44

user10011330

1 answer

hypothesis space - linear and logistic regression

I am new to machine learning and I came across the term "hypothesis space". I am trying to grasp what it is, and I am especially interested in the dimension of this "space". For example in the context of linear regression, trying to fit a linear polynomial…

machine-learning

asked Apr 28 '20 at 23:29

funmath

2 answers

PCA for classification

Assume that our samples are high-dimensional points (i.e., d is large) and we use PCA to reduce them to k = 10 dimensions. After this step, we found that all 10 new dimensions have continuous values (in other words, each feature in the…

machine-learning

asked Apr 18 '20 at 13:16

user754884

1 answer

The proof of random Fourier features

I am reading the following paper: https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf and I got to the proof of Claim 1. The proof states in the 6th line of page 8 that "We have $|f(\Delta)|<\epsilon$ for all $\Delta \in…

machine-learning

asked Apr 14 '20 at 06:17

nonpara

1 answer

Mistake in equation 9.75 of PRML by C. M. Bishop?

In the denominator of equation 9.75, shouldn't the positions of the product and the summation be exchanged? Posterior probability

machine-learning

asked Mar 22 '20 at 17:25

Miguel Angel Hombrados Herrera

1 answer

What happens to the softmax function when the input is multiplied by a very large scalar?

The softmax function is defined as $S: \mathbb{R}^n \to \mathbb{R}^n$, where $S(x)_i= \frac{e^{x_i}}{\sum_j e^{x_j}}$. Now consider multiplying $x$ by a scalar $c$, $S(cx)_i=\frac{e^{cx_i}}{\sum_j e^{cx_j}}$. What happens when $c$ gets arbitrarily…
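The behaviour for growing $c$ can be explored numerically. Below is a small sketch, assuming $c \to +\infty$ and using a hypothetical input `x = [1.0, 2.0, 3.0]`. The maximum is subtracted before exponentiating, which leaves the softmax value unchanged but avoids overflow. As $c$ grows, the output concentrates on the largest entry; when several entries tie for the maximum, the mass is split equally among them.

```python
import math

def softmax(x):
    """Softmax with the maximum subtracted first to avoid overflow."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input vector, for illustration only.
x = [1.0, 2.0, 3.0]
for c in (1, 10, 100, 1000):
    print(c, softmax([c * v for v in x]))

# With tied maxima the limit is uniform over the tied entries.
tied = softmax([1000.0, 1000.0, 0.0])
print(tied)
```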

machine-learning

asked Jan 09 '20 at 05:29

patamon

0 answers

The math part of machine learning

I want to start learning the math part of machine learning so that I can optimise models based on my own understanding. Are there any good sources you can recommend?

machine-learning

asked Jan 03 '20 at 04:35

nova

1 answer

What is the leave-one-out cross-validation mean squared error in the case of linear regression (Y = bX+c)?

Suppose you have the following data with one real-valued input variable and one real-valued output variable. What is the leave-one-out cross-validation mean squared error in the case of linear regression (Y = bX+c)? (0,2),(2,2),(3,1)
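The computation can be sketched as follows, assuming ordinary least squares is used to fit Y = bX+c on each training fold (the question does not state the fitting method). Each point is held out in turn, a line is fitted to the remaining two points, and the squared prediction error on the held-out point is recorded. The function names are hypothetical; exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction

def fit_line(points):
    """Ordinary least squares for y = b*x + c; returns (b, c)."""
    n = len(points)
    mean_x = sum(Fraction(x) for x, _ in points) / n
    mean_y = sum(Fraction(y) for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    c = mean_y - b * mean_x
    return b, c

def loocv_mse(points):
    """Mean of the squared errors on each held-out point."""
    errors = []
    for i, (x_out, y_out) in enumerate(points):
        train = points[:i] + points[i + 1:]
        b, c = fit_line(train)
        errors.append((y_out - (b * x_out + c)) ** 2)
    return sum(errors) / len(errors)

data = [(0, 2), (2, 2), (3, 1)]
mse = loocv_mse(data)
print(mse)
```

Under that assumption the three held-out squared errors are 4, 4/9 and 1, so the mean is 49/27.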

machine-learning

asked Dec 23 '19 at 19:40

mathlove

1 answer

In a Support Vector Machine, why is the distance from the origin to the decision boundary b / ||w||?

In the picture of SVM from Wikipedia, at the lower left corner (pointed to by the red arrow), the distance from (0,0) to the decision boundary is b / ||w||. Why is that? Thanks.
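A quick numerical check may help here. This sketch assumes the decision boundary is written as $w \cdot x - b = 0$, and the values of `w` and `b` are hypothetical. Every point on the boundary satisfies $w \cdot x = b$, so by Cauchy-Schwarz $|b| = |w \cdot x| \le \|w\| \, \|x\|$, which gives $\|x\| \ge |b| / \|w\|$, with equality when $x$ is parallel to $w$. The closest point to the origin is therefore $x_0 = \frac{b}{\|w\|^2} w$.

```python
import math

def closest_point_to_origin(w, b):
    """Point on the hyperplane w.x - b = 0 nearest the origin: (b / ||w||^2) * w."""
    norm_sq = sum(wi * wi for wi in w)
    return [b * wi / norm_sq for wi in w]

# Hypothetical weight vector and offset, for illustration only.
w = [3.0, 4.0]
b = 10.0

x0 = closest_point_to_origin(w, b)
on_boundary = sum(wi * xi for wi, xi in zip(w, x0)) - b  # should be ~0
distance = math.hypot(x0[0], x0[1])                      # ||x0||
expected = abs(b) / math.hypot(w[0], w[1])               # |b| / ||w||
print(x0, distance, expected)
```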

machine-learning

asked Dec 08 '19 at 05:59

Fred Chang

4 answers

Machine learning model accuracy

A machine learning model gets an accuracy of 90% on a dataset with 90% positive class and 10% negative class. Can we conclude that the model is a good classifier of the data?
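One way to see why accuracy alone is not enough: a trivial model that always predicts the majority class reaches the same 90% on such a dataset. The sketch below uses a hypothetical dataset of 100 labels with the stated 90/10 split and compares accuracy with per-class recall; the helper names are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def recall_for(label, y_true, y_pred):
    """Fraction of examples with the given true label that were predicted correctly."""
    relevant = [(t, p) for t, p in zip(y_true, y_pred) if t == label]
    return sum(1 for t, p in relevant if p == label) / len(relevant)

# Hypothetical dataset: 90 positives (1) and 10 negatives (0).
y_true = [1] * 90 + [0] * 10
# Trivial baseline: always predict the positive class.
y_pred = [1] * 100

acc = accuracy(y_true, y_pred)
recall_pos = recall_for(1, y_true, y_pred)
recall_neg = recall_for(0, y_true, y_pred)
balanced_acc = (recall_pos + recall_neg) / 2
print(acc, recall_pos, recall_neg, balanced_acc)
```

The baseline never identifies a negative example, so its recall on the negative class is 0 and its balanced accuracy is 0.5, despite the 90% accuracy.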

machine-learning

asked Aug 27 '19 at 14:05

emily

1 answer

What is "dual form"?

I am reading up on AI and am now reading https://en.wikipedia.org/wiki/Kernel_perceptron. It says: To derive a kernelized version of the perceptron algorithm, we must first formulate it in dual form, starting from the observation that the weight vector w…

machine-learning

asked Jul 20 '19 at 14:10

Paul Ogilvie

0 answers

Why is DDPG an off-policy method while policy gradient is by definition on-policy?

Why is DDPG an off-policy method while policy gradient is by definition on-policy? DDPG is updated in an off-policy manner while policy gradient is on-policy. So is DDPG not a policy gradient method?

machine-learning

asked May 22 '19 at 14:22

ccc
